Date: Tue, 9 Aug 2022 02:51:17 +0530
From: Srinivas Aji
To: David Hildenbrand
Cc: Linux MM, Dan Williams, Vivek Goyal, David Woodhouse, "Gowans, James", Yue Li, Beau Beauchamp
Subject: Re: [RFC PATCH 0/4] Allow persistent data on DAX device being used as KMEM
In-Reply-To: <922eda33-be7b-f413-6285-33ed0ea0f09e@redhat.com>

On Fri, Aug 05, 2022 at 02:46:26PM +0200, David Hildenbrand wrote:
> Can you explain how "zero copy snapshots of processes" would work, both
>
> a) From a user space POV
> b) From a kernel-internal POV
>
> Especially, what I get is that you have a filesystem on that memory
> region, and all memory that is not used for filesystem blocks can be
> used as ordinary system RAM (a little like shmem, but restricted to dax
> memory regions?).
>
> But how does this interact with zero-copy snapshots?
>
> I feel like I am missing one piece where we really need system RAM as
> part of the bigger picture. Hopefully it's not some hack that converts
> system RAM to file system blocks :)

My proposal probably falls into this category. The idea is that if we
have the persistent filesystem in the same space as system RAM, we
could make most of the process pages part of a snapshot file by
holding references to these pages and making the pages copy-on-write
for the process, in about the same way a forked child would.

(I still don't have this piece fully worked out. Maybe there are
reasons why this won't work or will make something else difficult, and
that is why you are advising against it.)
Regarding the userspace and kernel POV:

From userspace, the process would save or restore its pages using
vmsplice(). In the kernel, this would be implemented using a
filesystem which shares pages with system RAM and uses a zero-copy COW
mechanism for those process pages which can be shared with the
filesystem.

I had earlier been thinking of having a different interface to the
kernel, which creates a file with only those memory pages which can be
saved using COW and also indicates to the caller which pages have
actually been saved. But a vmsplice() implementation which does COW as
far as possible keeps the division of labour simple: the userspace
process indicates the desired function (saving or restoring memory
pages), and the kernel handles the zero copy as an optimization where
possible.

Thanks,
Srinivas