From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 25 Apr 2025 12:02:46 +0800
From: Yan Zhao
To: Ackerley Tng
CC: Vishal Annapurve, Chenyi Qiang
Subject: Re: [RFC PATCH 39/39] KVM: guest_memfd: Dynamically split/reconstruct HugeTLB page
Reply-To: Yan Zhao
References: <38723c5d5e9b530e52f28b9f9f4a6d862ed69bcd.1726009989.git.ackerleytng@google.com>
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Thu, Apr 24, 2025 at 11:15:11AM -0700, Ackerley Tng wrote:
> Vishal Annapurve writes:
>
> > On Thu, Apr 24, 2025 at 1:15 AM Yan Zhao wrote:
> >>
> >> On Thu, Apr 24, 2025 at 01:55:51PM +0800, Chenyi Qiang wrote:
> >> >
> >> >
> >> > On 4/24/2025 12:25 PM, Yan Zhao wrote:
> >> > > On Thu, Apr 24, 2025 at 09:09:22AM +0800, Yan Zhao wrote:
> >> > >> On Wed, Apr 23, 2025 at 03:02:02PM -0700, Ackerley Tng wrote:
> >> > >>> Yan Zhao writes:
> >> > >>>
> >> > >>>> On Tue, Sep 10, 2024 at 11:44:10PM +0000, Ackerley Tng wrote:
> >> > >>>>> +/*
> >> > >>>>> + * Allocates and then caches a folio in the filemap. Returns a folio with
> >> > >>>>> + * refcount of 2: 1 after allocation, and 1 taken by the filemap.
> >> > >>>>> + */
> >> > >>>>> +static struct folio *kvm_gmem_hugetlb_alloc_and_cache_folio(struct inode *inode,
> >> > >>>>> +							    pgoff_t index)
> >> > >>>>> +{
> >> > >>>>> +	struct kvm_gmem_hugetlb *hgmem;
> >> > >>>>> +	pgoff_t aligned_index;
> >> > >>>>> +	struct folio *folio;
> >> > >>>>> +	int nr_pages;
> >> > >>>>> +	int ret;
> >> > >>>>> +
> >> > >>>>> +	hgmem = kvm_gmem_hgmem(inode);
> >> > >>>>> +	folio = kvm_gmem_hugetlb_alloc_folio(hgmem->h, hgmem->spool);
> >> > >>>>> +	if (IS_ERR(folio))
> >> > >>>>> +		return folio;
> >> > >>>>> +
> >> > >>>>> +	nr_pages = 1UL << huge_page_order(hgmem->h);
> >> > >>>>> +	aligned_index = round_down(index, nr_pages);
> >> > >>>> Maybe a gap here.
> >> > >>>>
> >> > >>>> When a guest_memfd is bound to a slot where slot->base_gfn is not aligned to
> >> > >>>> 2M/1G and slot->gmem.pgoff is 0, even if an index is 2M/1G aligned, the
> >> > >>>> corresponding GFN is not 2M/1G aligned.
> >> > >>>
> >> > >>> Thanks for looking into this.
> >> > >>>
> >> > >>> In 1G page support for guest_memfd, the offset and size are always
> >> > >>> hugepage aligned to the hugepage size requested at guest_memfd creation
> >> > >>> time, and it is true that when binding to a memslot, slot->base_gfn and
> >> > >>> slot->npages may not be hugepage aligned.
> >> > >>>
> >> > >>>>
> >> > >>>> However, TDX requires that private huge pages be 2M aligned in GFN.
> >> > >>>>
> >> > >>>
> >> > >>> IIUC other factors also contribute to determining the mapping level in
> >> > >>> the guest page tables, like lpage_info and .private_max_mapping_level()
> >> > >>> in kvm_x86_ops.
> >> > >>>
> >> > >>> If slot->base_gfn and slot->npages are not hugepage aligned, lpage_info
> >> > >>> will track that and not allow faulting into guest page tables at higher
> >> > >>> granularity.
> >> > >>
> >> > >> lpage_info only checks the alignments of slot->base_gfn and
> >> > >> slot->base_gfn + npages.
> >> > >> e.g.,
> >> > >>
> >> > >> if slot->base_gfn is 8K, npages is 8M, then for this slot,
> >> > >> lpage_info[2M][0].disallow_lpage = 1, which is for GFN [4K, 2M+8K);
> >> > >> lpage_info[2M][1].disallow_lpage = 0, which is for GFN [2M+8K, 4M+8K);
> >> > >> lpage_info[2M][2].disallow_lpage = 0, which is for GFN [4M+8K, 6M+8K);
> >> > >> lpage_info[2M][3].disallow_lpage = 1, which is for GFN [6M+8K, 8M+8K);
> >> >
> >> > Should it be?
> >> > lpage_info[2M][0].disallow_lpage = 1, which is for GFN [8K, 2M);
> >> > lpage_info[2M][1].disallow_lpage = 0, which is for GFN [2M, 4M);
> >> > lpage_info[2M][2].disallow_lpage = 0, which is for GFN [4M, 6M);
> >> > lpage_info[2M][3].disallow_lpage = 0, which is for GFN [6M, 8M);
> >> > lpage_info[2M][4].disallow_lpage = 1, which is for GFN [8M, 8M+8K);
> >> Right. Good catch. Thanks!
> >>
> >> Let me update the example as below:
> >> slot->base_gfn is 2 (for GPA 8KB), npages is 2048 (for an 8MB range)
> >>
> >> lpage_info[2M][0].disallow_lpage = 1, which is for GPA [8KB, 2MB);
> >> lpage_info[2M][1].disallow_lpage = 0, which is for GPA [2MB, 4MB);
> >> lpage_info[2M][2].disallow_lpage = 0, which is for GPA [4MB, 6MB);
> >> lpage_info[2M][3].disallow_lpage = 0, which is for GPA [6MB, 8MB);
> >> lpage_info[2M][4].disallow_lpage = 1, which is for GPA [8MB, 8MB+8KB);
> >>
> >> lpage_info indicates that a 2MB mapping is allowed to cover GPA 4MB and GPA
> >> 4MB+16KB. However, their aligned_index values lead guest_memfd to allocate two
> >> 2MB folios, whose physical addresses may not be contiguous.
> >>
> >> Additionally, if the guest accesses two GPAs, e.g., GPA 2MB+8KB and GPA 4MB,
> >> KVM could create two 2MB mappings to cover GPA ranges [2MB, 4MB), [4MB, 6MB).
> >> However, guest_memfd just allocates the same 2MB folio for both faults.
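The updated example can be checked with a small userspace model. This is only
a sketch and not kernel code: struct slot_model, lpage_index(), disallow_2m()
and gmem_aligned_index() are hypothetical names, 4KB base pages are assumed,
and the guest_memfd index is taken as gfn - slot->base_gfn + slot->gmem.pgoff.

```c
#include <stdint.h>

#define PAGES_PER_2M 512ULL	/* 2MB / 4KB, assuming 4KB base pages */

/* Hypothetical stand-in for the memslot fields used in this thread. */
struct slot_model {
	uint64_t base_gfn;	/* first GFN of the slot */
	uint64_t npages;	/* slot size in 4KB pages */
	uint64_t pgoff;		/* guest_memfd offset in 4KB pages */
};

/* Index of the 2MB-aligned GFN block containing gfn, relative to the slot. */
static uint64_t lpage_index(const struct slot_model *s, uint64_t gfn)
{
	return gfn / PAGES_PER_2M - s->base_gfn / PAGES_PER_2M;
}

/*
 * Models only the head/tail alignment rule discussed above: the first and
 * last 2MB blocks are disallowed when the slot start or end is unaligned.
 */
static int disallow_2m(const struct slot_model *s, uint64_t gfn)
{
	uint64_t end = s->base_gfn + s->npages;
	uint64_t idx = lpage_index(s, gfn);
	uint64_t last = lpage_index(s, end - 1);

	if (idx == 0 && (s->base_gfn % PAGES_PER_2M))
		return 1;
	if (idx == last && (end % PAGES_PER_2M))
		return 1;
	return 0;
}

/* guest_memfd side: round the file index down to a 2MB folio boundary. */
static uint64_t gmem_aligned_index(const struct slot_model *s, uint64_t gfn)
{
	uint64_t index = gfn - s->base_gfn + s->pgoff;

	return index - index % PAGES_PER_2M;
}
```

With base_gfn = 2, npages = 2048 and pgoff = 0, GFN 1024 (GPA 4MB) and GFN 1028
(GPA 4MB+16KB) fall into the same lpage_info block 2, which is not disallowed,
yet their aligned indexes are 512 and 1024, i.e. two different folios. GFN 514
(GPA 2MB+8KB) and GFN 1024 sit in different lpage_info blocks but share
aligned index 512.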
> >>
> >>
> >> >
> >> > >>
> >> > >> ---------------------------------------------------------
> >> > >> |            |  |          |  |          |  |        |  |
> >> > >> 8K           2M 2M+8K      4M 4M+8K      6M 6M+8K    8M 8M+8K
> >> > >>
> >> > >> For GFN 6M and GFN 6M+4K, as they both belong to lpage_info[2M][2], huge
> >> > >> page is allowed. Also, they have the same aligned_index 2 in guest_memfd.
> >> > >> So, guest_memfd allocates the same huge folio of 2M order for them.
> >> > > Sorry, sent too fast this morning. The example is not right. The correct
> >> > > one is:
> >> > >
> >> > > For GFN 4M and GFN 4M+16K, lpage_info indicates that 2M is allowed. So,
> >> > > KVM will create a 2M mapping for them.
> >> > >
> >> > > However, in guest_memfd, GFN 4M and GFN 4M+16K do not correspond to the
> >> > > same 2M folio and physical addresses may not be contiguous.
> >
> > Then during binding, guest memfd offset misalignment with hugepage
> > should be same as gfn misalignment. i.e.
> >
> > (offset & ~huge_page_mask(h)) == ((slot->base_gfn << PAGE_SHIFT) &
> > ~huge_page_mask(h));
> >
> > For non guest_memfd backed scenarios, KVM allows slot gfn ranges that
> > are not hugepage aligned, so guest_memfd should also be able to
> > support non-hugepage aligned memslots.
> >
> > I drew up a picture [1] which hopefully clarifies this.
>
> Thanks for pointing this out, I understand better now and we will add an
> extra constraint during memslot binding of guest_memfd to check that gfn
> offsets within a hugepage must be guest_memfd offsets.

I'm a bit confused.

Since "index = gfn - slot->base_gfn + slot->gmem.pgoff", do you mean you are
going to force "slot->base_gfn == slot->gmem.pgoff"?

Some memory regions, e.g., "pc.ram", are divided into 2 parts:
- one with offset 0, size 0x80000000 (2G),
  positioned at GPA 0, which is below GPA 4G;
- one with offset 0x80000000 (2G), size 0x80000000 (2G),
  positioned at GPA 0x100000000 (4G), which is above GPA 4G.
For the second part, its slot->base_gfn is 0x100000 (GPA 0x100000000), while
slot->gmem.pgoff is 0x80000 (offset 0x80000000), assuming 4KB pages, so the
two are not equal.

> Adding checks at binding time will allow hugepage-unaligned offsets (to
> be at parity with non-guest_memfd backing memory) but still fix this
> issue.
>
> lpage_info will make sure that ranges near the bounds will be
> fragmented, but the hugepages in the middle will still be mappable as
> hugepages.
>
> [1] https://lpc.events/event/18/contributions/1764/attachments/1409/3706/binding-must-have-same-alignment.svg
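
If the intended constraint is instead the one Vishal wrote above (the same
misalignment within a hugepage, rather than base_gfn == pgoff), the "pc.ram"
layout would still pass. A minimal userspace sketch of that check follows,
assuming 4KB pages; binding_alignment_ok() and PAGE_SHIFT_4K are hypothetical
names, and hpage_size - 1 plays the role of ~huge_page_mask(h).

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12	/* assumption: 4KB base pages */

/*
 * Returns true when the guest_memfd offset and the slot's base GPA have the
 * same offset within a hugepage of hpage_size bytes (a power of two).
 * base_gfn and pgoff are both expressed in 4KB pages.
 */
static bool binding_alignment_ok(uint64_t base_gfn, uint64_t pgoff,
				 uint64_t hpage_size)
{
	uint64_t gpa = base_gfn << PAGE_SHIFT_4K;
	uint64_t offset = pgoff << PAGE_SHIFT_4K;

	return (gpa & (hpage_size - 1)) == (offset & (hpage_size - 1));
}
```

Under these assumptions, the second "pc.ram" part (base_gfn 0x100000, pgoff
0x80000) satisfies the check for both 2M and 1G, since both values are 1G
aligned. The earlier example with base_gfn 2 and pgoff 0 fails it for 2M, and
would only pass with a pgoff that is also 2 pages past a 2M boundary.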