From: Alistair Popple
To: David Hildenbrand
Cc: Jason Gunthorpe, "Kasireddy, Vivek", "Kim, Dongwon", "Chang, Junxiao", "dri-devel@lists.freedesktop.org", Hugh Dickins, Peter Xu, "linux-mm@kvack.org", Gerd Hoffmann, Mike Kravetz
Subject: Re: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)
Date: Fri, 04 Aug 2023 10:14:59 +1000
Message-ID: <87cz0364kx.fsf@nvdebian.thelocal>
In-reply-to: <2aee6681-f756-9ace-74d8-2f1e1e7b3ae6@redhat.com>
References: <48f22686-2c1b-fd9d-91ba-da6105d410db@redhat.com> <3427735b-2a73-2df7-ebd9-0d1066a55771@redhat.com> <2aee6681-f756-9ace-74d8-2f1e1e7b3ae6@redhat.com>

David Hildenbrand writes:

> On 03.08.23 14:14, Jason Gunthorpe wrote:
>> On Thu, Aug 03, 2023 at 07:35:51AM +0000, Kasireddy, Vivek wrote:
>>> Hi Jason,
>>>
>>>>>> Right, the "the zero pages are changed into writable pages" in your
>>>>>> above comment just might not apply, because there won't be any page
>>>>>> replacement (hopefully :) ).
>>>>
>>>>> If the page replacement does not happen when there are new writes to the
>>>>> area where the hole previously existed, then would we still get an invalidate
>>>>> when this happens? Is there any other way to get notified when the zeroed
>>>>> page is written to if the invalidate does not get triggered?
>>>>
>>>> What David is saying is that memfd does not use the zero page
>>>> optimization for hole punches. Any access to the memory, including
>>>> read-only access through hmm_range_fault() will allocate unique
>>>> pages. Since there is no zero page and no zero-page replacement there
>>>> is no issue with invalidations.
>>
>>> It looks like even with hmm_range_fault(), the invalidate does not get
>>> triggered when the hole is refilled with new pages because of writes.
>>> This is probably because hmm_range_fault() does not fault in any pages
>>> that get invalidated later when writes occur.
>> hmm_range_fault() returns the current content of the VMAs, or it
>> faults. If it returns pages then it came from one of these two places.
>> If your VMA is incoherent with what you are doing then you have bigger
>> problems, or maybe you found a bug.

Note that it will only fault in pages if HMM_PFN_REQ_FAULT is specified.
You are setting that; however, you aren't setting HMM_PFN_REQ_WRITE, which
is what would trigger a fault to bring in the new pages. Does setting that
fix the issue you are seeing?

>>> The above log messages are seen immediately after the hole is punched. As
>>> you can see, hmm_range_fault() returns the pfns of old pages and not zero
>>> pages. And, I see the below messages (with patch #2 in this series applied)
>>> as the hole is refilled after writes:
>> I don't know what you are doing, but it is something wrong or you've
>> found a bug in the memfds.
>
>
> Maybe THP is involved? I recently had to dig that out for an internal
> discussion:
>
> "Currently when truncating shmem file, if the range is partial of THP
> (start or end is in the middle of THP), the pages actually will just get
> cleared rather than being freed unless the range cover the whole THP.
> Even though all the subpages are truncated (randomly or sequentially),
> the THP may still be kept in page cache. This might be fine for some
> usecases which prefer preserving THP."
>
> My recollection is that this behavior was never changed.
>
> https://lore.kernel.org/all/1575420174-19171-1-git-send-email-yang.shi@linux.alibaba.com/
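
For reference, the userspace half of the sequence discussed in this thread (populate a memfd, punch a hole, read it back, then write into the hole again) can be sketched as below. This is only an illustration under stated assumptions: the helper name punch_and_refill() is hypothetical and not from the patch series, and the sketch does not exercise hmm_range_fault() or the mmu notifiers, which are kernel-side. It only shows that the punched range reads back as zeros and that a later write to it sticks, which holds whether the kernel frees the pages or, as in the partial-THP case quoted above, merely clears them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>          /* fallocate() */
#include <linux/falloc.h>   /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>       /* memfd_create(), mmap(), munmap() */
#include <sys/types.h>
#include <unistd.h>         /* ftruncate(), close() */

/*
 * Hypothetical helper (illustration only): create a memfd of 'len' bytes,
 * fill it, punch a hole over the whole range, and report what userspace sees.
 *   *zeroed   = 1 if every byte of the punched range reads back as 0
 *   *refilled = 1 if a write into the punched range is visible afterwards
 * Returns 0 on success, -1 if any system call fails.
 */
int punch_and_refill(size_t len, int *zeroed, int *refilled)
{
    assert(len > 0 && zeroed && refilled);

    int fd = memfd_create("hole-demo", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Populate every page so there is something to punch out. */
    memset(p, 0xAB, len);

    /* Punch a hole over the whole range while keeping the file size. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, (off_t)len) < 0) {
        munmap(p, len);
        close(fd);
        return -1;
    }

    /* Reading the punched range must now return zeros. */
    *zeroed = 1;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            *zeroed = 0;
            break;
        }
    }

    /* Writing into the hole refills it; the new value must be visible. */
    p[0] = 0x5A;
    *refilled = (p[0] == 0x5A);

    munmap(p, len);
    close(fd);
    return 0;
}
```

Calling punch_and_refill() with a length of a few pages from a small main() is enough to try it. Observing whether a device mapping sees the refill would additionally need the kernel-side hmm_range_fault() request to include HMM_PFN_REQ_WRITE, as noted in the reply above.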