Date: Wed, 23 Apr 2025 15:02:02 -0700
From: Ackerley Tng
To: Yan Zhao
Cc: tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk,
	jgg@nvidia.com, peterx@redhat.com, david@redhat.com,
	rientjes@google.com, fvdl@google.com, jthoughton@google.com,
	seanjc@google.com, pbonzini@redhat.com, zhiquan1.li@intel.com,
	fan.du@intel.com, jun.miao@intel.com, isaku.yamahata@intel.com,
	muchun.song@linux.dev, erdemaktas@google.com, vannapurve@google.com,
	qperret@google.com, jhubbard@nvidia.com, willy@infradead.org,
	shuah@kernel.org, brauner@kernel.org, bfoster@redhat.com,
	kent.overstreet@linux.dev, pvorel@suse.cz, rppt@kernel.org,
	richard.weiyang@gmail.com, anup@brainfault.org, haibo1.xu@intel.com,
	ajones@ventanamicro.com, vkuznets@redhat.com,
	maciej.wieczor-retman@intel.com, pgonda@google.com,
	oliver.upton@linux.dev, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, kvm@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [RFC PATCH 39/39] KVM: guest_memfd: Dynamically split/reconstruct HugeTLB page
References: <38723c5d5e9b530e52f28b9f9f4a6d862ed69bcd.1726009989.git.ackerleytng@google.com>

Yan Zhao writes:

> On Tue, Sep 10, 2024 at 11:44:10PM +0000, Ackerley Tng wrote:
>> +/*
>> + * Allocates and then caches a folio in the filemap. Returns a folio with
>> + * refcount of 2: 1 after allocation, and 1 taken by the filemap.
>> + */
>> +static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode,
>> +							    pgoff_t index)
>> +{
>> +	struct kvm_gmem_hugetlb *hgmem;
>> +	pgoff_t aligned_index;
>> +	struct folio *folio;
>> +	int nr_pages;
>> +	int ret;
>> +
>> +	hgmem = kvm_gmem_hgmem(inode);
>> +	folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool);
>> +	if (IS_ERR(folio))
>> +		return folio;
>> +
>> +	nr_pages = 1UL << huge_page_order(hgmem->h);
>> +	aligned_index = round_down(index, nr_pages);
> Maybe a gap here.
>
> When a guest_memfd is bound to a slot where slot->base_gfn is not aligned to
> 2M/1G and slot->gmem.pgoff is 0, even if an index is 2M/1G aligned, the
> corresponding GFN is not 2M/1G aligned.

Thanks for looking into this.

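To make the case above concrete, here is a small userspace sketch of the
index-to-GFN arithmetic. It is not kernel code: struct slot_sketch,
index_to_gfn() and is_aligned() are names made up for illustration, and the
mapping gfn = base_gfn + (index - pgoff) is my assumption of how a bound
memslot relates guest_memfd offsets to GFNs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the memslot fields discussed in this thread. */
struct slot_sketch {
	uint64_t base_gfn;	/* first GFN covered by the slot */
	uint64_t pgoff;		/* page offset of the slot into the guest_memfd */
};

/* GFN backed by guest_memfd page offset 'index' (assumed linear mapping). */
static uint64_t index_to_gfn(const struct slot_sketch *slot, uint64_t index)
{
	return slot->base_gfn + (index - slot->pgoff);
}

/* True if 'value' is a multiple of 'nr_pages' (nr_pages is a power of two). */
static bool is_aligned(uint64_t value, uint64_t nr_pages)
{
	return (value & (nr_pages - 1)) == 0;
}
```

With base_gfn = 0x101, pgoff = 0 and 512 pages per 2M hugepage, index 512 is
2M aligned but index_to_gfn() gives 0x301, which is not.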
In 1G page support for guest_memfd, the offset and size are always
aligned to the hugepage size requested at guest_memfd creation time. It
is true that, when binding to a memslot, slot->base_gfn and slot->npages
may not be hugepage aligned.

> However, TDX requires that private huge pages be 2M aligned in GFN.

IIUC other factors also contribute to determining the mapping level in
the guest page tables, like lpage_info and .private_max_mapping_level()
in kvm_x86_ops.

If slot->base_gfn and slot->npages are not hugepage aligned, lpage_info
will track that and not allow faulting into guest page tables at higher
granularity.

Hence I think it is okay to leave it to KVM to fault pages into the
guest correctly. guest_memfd will just maintain the invariant that
offset and size are hugepage aligned, but will not require that
slot->base_gfn and slot->npages are hugepage aligned. This behavior will
be consistent with other backing memory for guests like regular shmem or
HugeTLB.

>> +	ret = kvm_gmem_hugetlb_filemap_add_folio(inode->i_mapping, folio,
>> +						 aligned_index,
>> +						 htlb_alloc_mask(hgmem->h));
>> +	WARN_ON(ret);
>> +
>>  	spin_lock(&inode->i_lock);
>>  	inode->i_blocks += blocks_per_huge_page(hgmem->h);
>>  	spin_unlock(&inode->i_lock);
>>
>> -	return page_folio(requested_page);
>> +	return folio;
>> +}
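
As a rough illustration of the kind of check I mean by "lpage_info will
track that", here is a simplified userspace model. It is only a sketch
under my own assumptions, not KVM's actual lpage_info code: struct
slot_model and huge_mapping_allowed() are hypothetical names, and the two
conditions are my reading of what has to hold before a GFN can be mapped
at a larger granularity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memslot bound to a guest_memfd. */
struct slot_model {
	uint64_t base_gfn;	/* first GFN covered by the slot */
	uint64_t npages;	/* number of 4K pages in the slot */
	uint64_t pgoff;		/* page offset of the slot into the guest_memfd */
};

/*
 * Returns true if 'gfn' may be mapped with a huge page of 'nr_pages'
 * base pages (nr_pages must be a power of two).
 */
static bool huge_mapping_allowed(const struct slot_model *slot,
				 uint64_t gfn, uint64_t nr_pages)
{
	uint64_t mask = nr_pages - 1;
	uint64_t start = gfn & ~mask;		/* huge-aligned start of the range */
	uint64_t end = start + nr_pages;	/* exclusive end of the range */

	/* The whole huge GFN range must lie inside the slot. */
	if (start < slot->base_gfn || end > slot->base_gfn + slot->npages)
		return false;

	/* GFN and guest_memfd offset must be misaligned by the same amount. */
	if ((slot->base_gfn ^ slot->pgoff) & mask)
		return false;

	return true;
}
```

In this model a slot with base_gfn = 0x101 and pgoff = 0 never gets a 2M
mapping, so the misaligned case simply falls back to 4K mappings while
guest_memfd keeps its own offsets hugepage aligned.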