Date: Mon, 18 Sep 2023 11:36:47 -0500
From: Michael Roth
To: Sean Christopherson
CC: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen,
	Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, "Matthew Wilcox (Oracle)", Andrew Morton, Paul Moore,
	James Morris, "Serge E. Hallyn", Chao Peng, Fuad Tabba,
	Jarkko Sakkinen, Anish Moorthy, Yu Zhang, Isaku Yamahata, Xu Yilun,
	Vlastimil Babka, Vishal Annapurve, Ackerley Tng, Maciej Szmigiero,
	David Hildenbrand, Quentin Perret, Wang, Liam Merwick,
	Isaku Yamahata, "Kirill A. Shutemov"
Subject: Re: [RFC PATCH v12 14/33] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
Message-ID: <20230918163647.m6bjgwusc7ww5tyu@amd.com>
References: <20230914015531.1419405-1-seanjc@google.com> <20230914015531.1419405-15-seanjc@google.com>
In-Reply-To: <20230914015531.1419405-15-seanjc@google.com>

On Wed, Sep 13, 2023 at 06:55:12PM -0700, Sean Christopherson wrote:
> TODO
>
> Cc: Fuad Tabba
> Cc: Vishal Annapurve
> Cc: Ackerley Tng
> Cc: Jarkko Sakkinen
> Cc: Maciej Szmigiero
> Cc: Vlastimil Babka
> Cc: David Hildenbrand
> Cc: Quentin Perret
> Cc: Michael Roth
> Cc: Wang
> Cc: Liam Merwick
> Cc: Isaku Yamahata
> Co-developed-by: Kirill A. Shutemov
> Signed-off-by: Kirill A. Shutemov
> Co-developed-by: Yu Zhang
> Signed-off-by: Yu Zhang
> Co-developed-by: Chao Peng
> Signed-off-by: Chao Peng
> Co-developed-by: Ackerley Tng
> Signed-off-by: Ackerley Tng
> Co-developed-by: Isaku Yamahata
> Signed-off-by: Isaku Yamahata
> Signed-off-by: Sean Christopherson
> ---
>  include/linux/kvm_host.h   |  48 +++
>  include/uapi/linux/kvm.h   |  15 +-
>  include/uapi/linux/magic.h |   1 +
>  virt/kvm/Kconfig           |   4 +
>  virt/kvm/Makefile.kvm      |   1 +
>  virt/kvm/guest_mem.c       | 593 +++++++++++++++++++++++++++++++++++++
>  virt/kvm/kvm_main.c        |  61 +++-
>  virt/kvm/kvm_mm.h          |  38 +++
>  8 files changed, 756 insertions(+), 5 deletions(-)
>  create mode 100644 virt/kvm/guest_mem.c
>
> +static long kvm_gmem_punch_hole(struct inode *inode, loff_t offset, loff_t len)
> +{
> +	struct list_head *gmem_list = &inode->i_mapping->private_list;
> +	pgoff_t start = offset >> PAGE_SHIFT;
> +	pgoff_t end = (offset + len) >> PAGE_SHIFT;
> +	struct kvm_gmem *gmem;
> +
> +	/*
> +	 * Bindings must stable across invalidation to ensure the start+end
> +	 * are balanced.
> +	 */
> +	filemap_invalidate_lock(inode->i_mapping);
> +
> +	list_for_each_entry(gmem, gmem_list, entry) {
> +		kvm_gmem_invalidate_begin(gmem, start, end);

In v11 we used to call truncate_inode_pages_range() here to drop filemap's
reference on the folio. AFAICT the folios are only getting free'd upon
guest shutdown without this. Was this on purpose?

> +		kvm_gmem_invalidate_end(gmem, start, end);
> +	}
> +
> +	filemap_invalidate_unlock(inode->i_mapping);
> +
> +	return 0;
> +}
> +
> +static long kvm_gmem_allocate(struct inode *inode, loff_t offset, loff_t len)
> +{
> +	struct address_space *mapping = inode->i_mapping;
> +	pgoff_t start, index, end;
> +	int r;
> +
> +	/* Dedicated guest is immutable by default. */
> +	if (offset + len > i_size_read(inode))
> +		return -EINVAL;
> +
> +	filemap_invalidate_lock_shared(mapping);

We take the filemap lock here, but not for
kvm_gmem_get_pfn()->kvm_gmem_get_folio(). Is it needed there as well?

> +
> +	start = offset >> PAGE_SHIFT;
> +	end = (offset + len) >> PAGE_SHIFT;
> +
> +	r = 0;
> +	for (index = start; index < end; ) {
> +		struct folio *folio;
> +
> +		if (signal_pending(current)) {
> +			r = -EINTR;
> +			break;
> +		}
> +
> +		folio = kvm_gmem_get_folio(inode, index);
> +		if (!folio) {
> +			r = -ENOMEM;
> +			break;
> +		}
> +
> +		index = folio_next_index(folio);
> +
> +		folio_unlock(folio);
> +		folio_put(folio);
> +
> +		/* 64-bit only, wrapping the index should be impossible. */
> +		if (WARN_ON_ONCE(!index))
> +			break;
> +
> +		cond_resched();
> +	}
> +
> +	filemap_invalidate_unlock_shared(mapping);
> +
> +	return r;
> +}
> +
> +static int __kvm_gmem_create(struct kvm *kvm, loff_t size, struct vfsmount *mnt)
> +{
> +	const char *anon_name = "[kvm-gmem]";
> +	const struct qstr qname = QSTR_INIT(anon_name, strlen(anon_name));
> +	struct kvm_gmem *gmem;
> +	struct inode *inode;
> +	struct file *file;
> +	int fd, err;
> +
> +	inode = alloc_anon_inode(mnt->mnt_sb);
> +	if (IS_ERR(inode))
> +		return PTR_ERR(inode);
> +
> +	err = security_inode_init_security_anon(inode, &qname, NULL);
> +	if (err)
> +		goto err_inode;
> +
> +	inode->i_private = (void *)(unsigned long)flags;

The 'flags' argument isn't added until the subsequent patch that adds THP
support.

> +static bool kvm_gmem_is_valid_size(loff_t size, u64 flags)
> +{
> +	if (size < 0 || !PAGE_ALIGNED(size))
> +		return false;
> +
> +	return true;
> +}
> +
> +int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> +{
> +	loff_t size = args->size;
> +	u64 flags = args->flags;
> +	u64 valid_flags = 0;
> +
> +	if (flags & ~valid_flags)
> +		return -EINVAL;
> +
> +	if (!kvm_gmem_is_valid_size(size, flags))
> +		return -EINVAL;
> +
> +	return __kvm_gmem_create(kvm, size, flags, kvm_gmem_mnt);
> +}
> +
> +int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
> +		  unsigned int fd, loff_t offset)
> +{
> +	loff_t size = slot->npages << PAGE_SHIFT;
> +	unsigned long start, end, flags;
> +	struct kvm_gmem *gmem;
> +	struct inode *inode;
> +	struct file *file;
> +
> +	BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
> +
> +	file = fget(fd);
> +	if (!file)
> +		return -EBADF;
> +
> +	if (file->f_op != &kvm_gmem_fops)
> +		goto err;
> +
> +	gmem = file->private_data;
> +	if (gmem->kvm != kvm)
> +		goto err;
> +
> +	inode = file_inode(file);
> +	flags = (unsigned long)inode->i_private;
> +
> +	/*
> +	 * For simplicity, require the offset into the file and the size of the
> +	 * memslot to be aligned to the largest possible page size used to back
> +	 * the file (same as the size of the file itself).
> +	 */
> +	if (!kvm_gmem_is_valid_size(offset, flags) ||
> +	    !kvm_gmem_is_valid_size(size, flags))
> +		goto err;

I needed to relax this check for SNP. KVM_GUEST_MEMFD_ALLOW_HUGEPAGE
applies to the entire gmem inode, so it makes sense for userspace to enable
hugepages if start/end are hugepage-aligned, but QEMU will do things
like map overlapping regions for ROMs and other things on top of the
GPA range that the gmem inode was originally allocated for.
For instance:

  692500@1689108688.696338:kvm_set_user_memory AddrSpace#0 Slot#0 flags=0x4 gpa=0x0 size=0x80000000 ua=0x7fbf5be00000 ret=0 restricted_fd=19 restricted_offset=0x0
  692500@1689108688.699802:kvm_set_user_memory AddrSpace#0 Slot#1 flags=0x4 gpa=0x100000000 size=0x380000000 ua=0x7fbfdbe00000 ret=0 restricted_fd=19 restricted_offset=0x80000000
  692500@1689108688.795412:kvm_set_user_memory AddrSpace#0 Slot#0 flags=0x0 gpa=0x0 size=0x0 ua=0x7fbf5be00000 ret=0 restricted_fd=19 restricted_offset=0x0
  692500@1689108688.795550:kvm_set_user_memory AddrSpace#0 Slot#0 flags=0x4 gpa=0x0 size=0xc0000 ua=0x7fbf5be00000 ret=0 restricted_fd=19 restricted_offset=0x0
  692500@1689108688.796227:kvm_set_user_memory AddrSpace#0 Slot#6 flags=0x4 gpa=0x100000 size=0x7ff00000 ua=0x7fbf5bf00000 ret=0 restricted_fd=19 restricted_offset=0x100000

Because of that, the KVM_SET_USER_MEMORY_REGIONs for non-THP-aligned GPAs
will fail. Maybe instead it should be allowed, and kvm_gmem_get_folio()
should handle the alignment checks on a case-by-case basis and simply
force 4k for offsets corresponding to unaligned bindings?

-Mike
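
To make the alignment arithmetic concrete, here is a rough userspace sketch of the two checks being discussed, assuming 4K base pages and 2M hugepages. It is not kernel code and not taken from the patch: struct sketch_binding and both sketch_* helpers are hypothetical names used only for illustration. The first helper mirrors a whole-binding alignment requirement, the second is one possible reading of the "force 4k for unaligned bindings" idea, decided per 2M block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes: 4K base pages, 2M hugepages. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_HPAGE_SIZE  (UINT64_C(1) << 21)
#define SKETCH_HPAGE_PAGES (SKETCH_HPAGE_SIZE >> SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for one memslot binding into the gmem file. */
struct sketch_binding {
	uint64_t offset; /* byte offset into the gmem file */
	uint64_t size;   /* byte length of the binding */
};

/* Whole-binding check: both offset and size must be 2M-aligned. */
static bool sketch_binding_is_aligned(const struct sketch_binding *b)
{
	return ((b->offset | b->size) & (SKETCH_HPAGE_SIZE - 1)) == 0;
}

/*
 * Per-block check: true if the 2M-aligned block containing page 'index'
 * lies entirely inside the binding, so a hugepage could back it.
 * Otherwise the caller would fall back to a 4K page.
 */
static bool sketch_can_use_hugepage(const struct sketch_binding *b,
				    uint64_t index)
{
	uint64_t first = b->offset >> SKETCH_PAGE_SHIFT;
	uint64_t last = (b->offset + b->size) >> SKETCH_PAGE_SHIFT; /* exclusive */
	uint64_t huge_start = index & ~(SKETCH_HPAGE_PAGES - 1);
	uint64_t huge_end = huge_start + SKETCH_HPAGE_PAGES;

	/* The index must belong to the binding in the first place. */
	assert(index >= first && index < last);

	return huge_start >= first && huge_end <= last;
}
```

Plugging in the trace above: the bindings at offset 0x0/size 0xc0000 and offset 0x100000/size 0x7ff00000 both fail the whole-binding check. Under the per-block reading, the first 2M block of the latter binding would be forced to 4K, while everything from file offset 0x200000 (page index 512) onward could still be hugepage-backed.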