From: David Hildenbrand
Organization: Red Hat
Date: Fri, 5 Aug 2022 14:46:26 +0200
Subject: Re: [RFC PATCH 0/4] Allow persistent data on DAX device being used as KMEM
To: Srinivas Aji, linux-nvdimm, Linux MM
Cc: Dan Williams, Vivek Goyal, David Woodhouse, "Gowans, James", Yue Li, Beau Beauchamp
Message-ID: <922eda33-be7b-f413-6285-33ed0ea0f09e@redhat.com>

On 02.08.22 19:57, Srinivas Aji wrote:
> Linux supports adding a DAX driver managed memory region as system
> memory using the KMEM driver (from version 5.1). We would like to use
> a persistent addressable memory segment as system memory and
> simultaneously for storing some persistent data.
>
> Motivation: It is already possible to partition an NVDIMM device for
> RAM and storage by creating separate regions on the device and using
> one of them with KMEM and another as fsdax. This patch set is a start
> to trying to get zero copy snapshots of processes which are using the
> DAX device as RAM. That requires dynamically sharing pages between
> process RAM and the storage within a single NVDIMM region.
>
> To do this, we add a layer for handling the persistent data which does
> the following:
>
> 1. When a DAX device is added as KMEM, mark all the memory as
>    allocated and pass it up to a module which is aware of the storage
>    layout.
>
> 2. This module scans the memory, identifies the unused parts, and
>    frees those memory pages.
>
> 3. Further memory from this device is allocated using the kernel
>    memory allocation API. The memory allocation API currently allows
>    the allocation to be limited only based on NUMA node. So this
>    feature works only when the DAX device used as KMEM is the only
>    memory from its NUMA node.
>
> 4. Discarding of blocks previously used for persistent data results in
>    those blocks being freed to system memory.

Can you explain how "zero copy snapshots of processes" would work, both

a) From a user space POV
b) From a kernel-internal POV

Especially, what I get is that you have a filesystem on that memory
region, and all memory that is not used for filesystem blocks can be
used as ordinary system RAM (a little like shmem, but restricted to dax
memory regions?).

But how does this interact with zero-copy snapshots?

I feel like I am missing one piece where we really need system RAM as
part of the bigger picture. Hopefully it's not some hack that converts
system RAM to file system blocks :)

-- 
Thanks,

David / dhildenb