Date: Mon, 7 Oct 2024 10:45:20 -0700
Subject: Re: [RFC PATCH v3 08/10] mm/mshare: Add basic page table sharing support
To: "Kirill A. Shutemov"
Cc: akpm@linux-foundation.org, willy@infradead.org, markhemm@googlemail.com,
 viro@zeniv.linux.org.uk, david@redhat.com, khalid@kernel.org,
 andreyknvl@gmail.com, dave.hansen@intel.com, luto@kernel.org,
 brauner@kernel.org, arnd@arndb.de, ebiederm@xmission.com,
 catalin.marinas@arm.com, linux-arch@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhiramat@kernel.org,
 rostedt@goodmis.org, vasily.averin@linux.dev, xhao@linux.alibaba.com,
 pcc@google.com, neilb@suse.de, maz@kernel.org
References: <20240903232241.43995-1-anthony.yznaga@oracle.com>
 <20240903232241.43995-9-anthony.yznaga@oracle.com>
From: Anthony Yznaga

On 10/7/24 1:41 AM, Kirill A. Shutemov wrote:
> On Tue, Sep 03, 2024 at 04:22:39PM -0700, Anthony Yznaga wrote:
>> From: Khalid Aziz
>>
>> Add support for handling page faults in an mshare region by
>> redirecting the faults to operate on the mshare mm_struct and
>> vmas contained in it and to link the page tables of the faulting
>> process with the shared page tables in the mshare mm.
>> Modify the unmap path to ensure that page tables in mshare regions
>> are kept intact when a process exits. Note that they are also
>> kept intact and not unlinked from a process when an mshare region
>> is explicitly unmapped which is bug to be addressed.
>>
>> Signed-off-by: Khalid Aziz
>> Signed-off-by: Matthew Wilcox (Oracle)
>> Signed-off-by: Anthony Yznaga
>> ---
>>  mm/internal.h |  1 +
>>  mm/memory.c   | 62 ++++++++++++++++++++++++++++++++++++++++++++++++---
>>  mm/mshare.c   | 38 +++++++++++++++++++++++++++++++
>>  3 files changed, 98 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 8005d5956b6e..8ac224d96806 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -1578,6 +1578,7 @@ void unlink_file_vma_batch_init(struct unlink_vma_file_batch *);
>>  void unlink_file_vma_batch_add(struct unlink_vma_file_batch *, struct vm_area_struct *);
>>  void unlink_file_vma_batch_final(struct unlink_vma_file_batch *);
>>
>> +extern vm_fault_t find_shared_vma(struct vm_area_struct **vma, unsigned long *addrp);
>>  static inline bool vma_is_shared(const struct vm_area_struct *vma)
>>  {
>>  	return VM_SHARED_PT && (vma->vm_flags & VM_SHARED_PT);
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 3c01d68065be..f526aef71a61 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -387,11 +387,15 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
>>  		vma_start_write(vma);
>>  		unlink_anon_vmas(vma);
>>
>> +		/*
>> +		 * There is no page table to be freed for vmas that
>> +		 * are mapped in mshare regions
>> +		 */
>>  		if (is_vm_hugetlb_page(vma)) {
>>  			unlink_file_vma(vma);
>>  			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
>>  				floor, next ? next->vm_start : ceiling);
>> -		} else {
>> +		} else if (!vma_is_shared(vma)) {
>>  			unlink_file_vma_batch_init(&vb);
>>  			unlink_file_vma_batch_add(&vb, vma);
>>
>> @@ -399,7 +403,8 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
>>  			 * Optimization: gather nearby vmas into one call down
>>  			 */
>>  			while (next && next->vm_start <= vma->vm_end + PMD_SIZE
>> -			       && !is_vm_hugetlb_page(next)) {
>> +			       && !is_vm_hugetlb_page(next)
>> +			       && !vma_is_shared(next)) {
>>  				vma = next;
>>  				next = mas_find(mas, ceiling - 1);
>>  				if (unlikely(xa_is_zero(next)))
>> @@ -412,7 +417,9 @@ void free_pgtables(struct mmu_gather *tlb, struct ma_state *mas,
>>  			unlink_file_vma_batch_final(&vb);
>>  			free_pgd_range(tlb, addr, vma->vm_end,
>>  				floor, next ? next->vm_start : ceiling);
>> -		}
>> +		} else
>> +			unlink_file_vma(vma);
>> +
>>  		vma = next;
> I would rather have vma->vm_ops->free_pgtables() hook that would be defined
> to non-NULL for mshared and hugetlb VMAs

That's a good idea. I'll do that.

>
>>  	} while (vma);
>>  }
>> @@ -1797,6 +1804,13 @@ void unmap_page_range(struct mmu_gather *tlb,
>>  	pgd_t *pgd;
>>  	unsigned long next;
>>
>> +	/*
>> +	 * No need to unmap vmas that share page table through
>> +	 * mshare region
>> +	 */
>> +	if (vma_is_shared(vma))
>> +		return;
>> +
> Ditto. vma->vm_ops->unmap_page_range().

Okay, I can do that here, too.
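
For illustration only, the suggested hook could dispatch roughly as in the
userspace model below. This is a sketch, not the follow-up patch: the
types and names (vma_model, vm_ops_model, mshare_free_pgtables,
free_pgtables_model) are hypothetical stand-ins, and the kernel's
vm_operations_struct is not assumed to have such a member today.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct vma_model;

struct vm_ops_model {
	/* Hypothetical hook; NULL means "take the generic path". */
	void (*free_pgtables)(struct vma_model *vma);
};

struct vma_model {
	const struct vm_ops_model *ops;
	int generic_frees;	/* times the generic path ran */
	int hook_calls;		/* times the per-VMA hook ran */
};

/* An mshare-style hook: the page tables are shared, so free nothing. */
static void mshare_free_pgtables(struct vma_model *vma)
{
	vma->hook_calls++;
}

static const struct vm_ops_model mshare_ops_model = {
	.free_pgtables = mshare_free_pgtables,
};

/* Dispatch: prefer the hook when one is defined, else the generic path. */
static void free_pgtables_model(struct vma_model *vma)
{
	assert(vma != NULL);
	if (vma->ops && vma->ops->free_pgtables)
		vma->ops->free_pgtables(vma);
	else
		vma->generic_frees++;
}
```

With this shape the generic loop would not need to test for each special
VMA type by name; the same pattern would apply to an unmap_page_range()
hook.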
>
>>  	BUG_ON(addr >= end);
>>  	tlb_start_vma(tlb, vma);
>>  	pgd = pgd_offset(vma->vm_mm, addr);
>> @@ -5801,6 +5815,7 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>>  	struct mm_struct *mm = vma->vm_mm;
>>  	vm_fault_t ret;
>>  	bool is_droppable;
>> +	bool shared = false;
>>
>>  	__set_current_state(TASK_RUNNING);
>>
>> @@ -5808,6 +5823,21 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>>  	if (ret)
>>  		goto out;
>>
>> +	if (unlikely(vma_is_shared(vma))) {
>> +		/* XXX mshare does not support per-VMA locks yet so fallback to mm lock */
>> +		if (flags & FAULT_FLAG_VMA_LOCK) {
>> +			vma_end_read(vma);
>> +			return VM_FAULT_RETRY;
>> +		}
>> +
>> +		ret = find_shared_vma(&vma, &address);
>> +		if (ret)
>> +			return ret;
>> +		if (!vma)
>> +			return VM_FAULT_SIGSEGV;
>> +		shared = true;
> Do we need to update 'mm' variable here?
>
> It is going to be used to account the fault below. Not sure which mm has
> to account such faults.

The memcg accounting won't work right, and there's a bug here. The
mshare mm is allocated via mm_alloc(), which will initialize mm->owner
to current. As long as that task is around, count_memcg_event_mm() will
go through it to get a memcg to account to. But if the task has exited,
the mshare mm's owner is no longer valid but will still be used. I will
just clear the owner for now.

And a quick note on calling find_shared_vma() here. I found I needed to
move the call earlier, to do_user_addr_fault(), because there are
permission checks there that check vma flags, and they need to be done
against the vmas in the mshare mm.
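
A toy userspace model of the "clear the owner" stopgap, for illustration
only. It assumes that the accounting path skips an mm that has no owner;
the names (task_model, mm_model, count_event_model) are hypothetical and
none of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; not the kernel's task_struct or mm_struct. */
struct task_model {
	int memcg_events;	/* events charged through this task */
};

struct mm_model {
	struct task_model *owner;	/* NULL once the owner is cleared */
};

/*
 * Charge one event to the mm's owner, if it has one.
 * Returns 1 when the event was charged and 0 when it was skipped.
 * Assumption: an mm without an owner is skipped, not treated as an error.
 */
static int count_event_model(struct mm_model *mm)
{
	assert(mm != NULL);
	if (mm->owner == NULL)
		return 0;
	mm->owner->memcg_events++;
	return 1;
}
```

In this model a cleared owner means faults on the shared mm are simply
not charged, which avoids following a stale pointer but leaves open the
question of which mm should account for them.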
>
>> +	}
>> +
>>  	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
>>  					    flags & FAULT_FLAG_INSTRUCTION,
>>  					    flags & FAULT_FLAG_REMOTE)) {
>> @@ -5843,6 +5873,32 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
>>  	if (is_droppable)
>>  		ret &= ~VM_FAULT_OOM;
>>
>> +	/*
>> +	 * Release the read lock on the shared mm of a shared VMA unless
>> +	 * unless the lock has already been released.
>> +	 * The mmap lock will already have been released if __handle_mm_fault
>> +	 * returns VM_FAULT_COMPLETED or if it returns VM_FAULT_RETRY and
>> +	 * the flags FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT are
>> +	 * _not_ both set.
>> +	 * If the lock was released earlier, release the lock on the task's
>> +	 * mm now to keep lock state consistent.
>> +	 */
>> +	if (shared) {
>> +		int release_mmlock = 1;
>> +
>> +		if ((ret & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)) == 0) {
>> +			mmap_read_unlock(vma->vm_mm);
>> +			release_mmlock = 0;
>> +		} else if ((flags & FAULT_FLAG_ALLOW_RETRY) &&
>> +			(flags & FAULT_FLAG_RETRY_NOWAIT)) {
>> +			mmap_read_unlock(vma->vm_mm);
>> +			release_mmlock = 0;
>> +		}
>> +
>> +		if (release_mmlock)
>> +			mmap_read_unlock(mm);
>> +	}
>> +
>>  	if (flags & FAULT_FLAG_USER) {
>>  		mem_cgroup_exit_user_fault();
>>  		/*
>> diff --git a/mm/mshare.c b/mm/mshare.c
>> index f3f6ed9c3761..8f47c8d6e6a4 100644
>> --- a/mm/mshare.c
>> +++ b/mm/mshare.c
>> @@ -19,6 +19,7 @@
>>  #include
>>  #include
>>  #include
>> +#include "internal.h"
>>
>>  struct mshare_data {
>>  	struct mm_struct *mm;
>> @@ -33,6 +34,43 @@ struct msharefs_info {
>>  static const struct inode_operations msharefs_dir_inode_ops;
>>  static const struct inode_operations msharefs_file_inode_ops;
>>
>> +/* Returns holding the host mm's lock for read. Caller must release. */
>> +vm_fault_t
>> +find_shared_vma(struct vm_area_struct **vmap, unsigned long *addrp)
>> +{
>> +	struct vm_area_struct *vma, *guest = *vmap;
>> +	struct mshare_data *m_data = guest->vm_private_data;
>> +	struct mm_struct *host_mm = m_data->mm;
>> +	unsigned long host_addr;
>> +	pgd_t *pgd, *guest_pgd;
>> +
>> +	mmap_read_lock(host_mm);
> Hm. So we have current->mm locked here, right? So this is nested mmap
> lock. Have you tested it under lockdep? I expected it to complain.

Yes, it complains. I have patches to introduce and use
mmap_read_lock_nested().

Thank you for the feedback.

Anthony

>
>> +	host_addr = *addrp - guest->vm_start + host_mm->mmap_base;
>> +	pgd = pgd_offset(host_mm, host_addr);
>> +	guest_pgd = pgd_offset(guest->vm_mm, *addrp);
>> +	if (!pgd_same(*guest_pgd, *pgd)) {
>> +		set_pgd(guest_pgd, *pgd);
>> +		mmap_read_unlock(host_mm);
>> +		return VM_FAULT_NOPAGE;
>> +	}
>> +
>> +	*addrp = host_addr;
>> +	vma = find_vma(host_mm, host_addr);
>> +
>> +	/* XXX: expand stack? */
>> +	if (vma && vma->vm_start > host_addr)
>> +		vma = NULL;
>> +
>> +	*vmap = vma;
>> +
>> +	/*
>> +	 * release host mm lock unless a matching vma is found
>> +	 */
>> +	if (!vma)
>> +		mmap_read_unlock(host_mm);
>> +	return 0;
>> +}
>> +
>>  /*
>>   * Disallow partial unmaps of an mshare region for now. Unmapping at
>>   * boundaries aligned to the level page tables are shared at could
>> --
>> 2.43.5
>>
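
For reference, the address translation at the top of the quoted
find_shared_vma() can be restated as a standalone helper. This is a
userspace sketch of the arithmetic only; the function name
mshare_host_addr and its parameter names are made up for illustration,
and the precondition check is an addition that the quoted code does not
have.

```c
#include <assert.h>

/*
 * Model of the quoted line:
 *   host_addr = *addrp - guest->vm_start + host_mm->mmap_base;
 *
 * The offset of the faulting address within the guest (per-process)
 * mshare vma is applied to the base address of the host (mshare) mm.
 */
static unsigned long mshare_host_addr(unsigned long fault_addr,
				      unsigned long guest_start,
				      unsigned long host_base)
{
	/* Added for the sketch: the fault must lie at or above the vma start. */
	assert(fault_addr >= guest_start);
	return fault_addr - guest_start + host_base;
}
```

For example, a fault 0x3000 bytes into a guest vma starting at 0x40000000
maps to 0x10003000 when the host base is 0x10000000.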