Date: Mon, 22 Nov 2021 16:59:33 +0300
From: "Kirill A. Shutemov"
To: David Hildenbrand
Cc: Chao Peng , kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, qemu-devel@nongnu.org,
 Wanpeng Li , luto@kernel.org, "J . Bruce Fields" , dave.hansen@intel.com,
 "H . Peter Anvin" , ak@linux.intel.com, Jonathan Corbet , Joerg Roedel ,
 x86@kernel.org, Hugh Dickins , Ingo Molnar , Borislav Petkov ,
 jun.nakajima@intel.com, Thomas Gleixner , Vitaly Kuznetsov , Jim Mattson ,
 Sean Christopherson , susie.li@intel.com, Jeff Layton , john.ji@intel.com,
 Yu Zhang , Paolo Bonzini , Andrew Morton , "Kirill A . Shutemov"
Subject: Re: [RFC v2 PATCH 01/13] mm/shmem: Introduce F_SEAL_GUEST
Message-ID: <20211122135933.arjxpl7wyskkwvwv@box.shutemov.name>
References: <20211119134739.20218-1-chao.p.peng@linux.intel.com>
 <20211119134739.20218-2-chao.p.peng@linux.intel.com>
 <942e0dd6-e426-06f6-7b6c-0e80d23c27e6@redhat.com>
In-Reply-To: <942e0dd6-e426-06f6-7b6c-0e80d23c27e6@redhat.com>

On Fri, Nov 19, 2021 at 02:51:11PM +0100, David Hildenbrand wrote:
> On 19.11.21 14:47, Chao Peng wrote:
> > From: "Kirill A. Shutemov"
> >
> > The new seal type provides semantics required for KVM guest private
> > memory support. A file descriptor with the seal set is going to be used
> > as source of guest memory in confidential computing environments such as
> > Intel TDX and AMD SEV.
> >
> > F_SEAL_GUEST can only be set on empty memfd. After the seal is set
> > userspace cannot read, write or mmap the memfd.
> >
> > Userspace is in charge of guest memory lifecycle: it can allocate the
> > memory with falloc or punch hole to free memory from the guest.
> >
> > The file descriptor passed down to KVM as guest memory backend. KVM
> > register itself as the owner of the memfd via memfd_register_guest().
> >
> > KVM provides callback that needed to be called on fallocate and punch
> > hole.
> >
> > memfd_register_guest() returns callbacks that need be used for
> > requesting a new page from memfd.
> >
>
> Repeating the feedback I already shared in a private mail thread:
>
>
> As long as page migration / swapping is not supported, these pages
> behave like any longterm pinned pages (e.g., VFIO) or secretmem pages.
>
> 1. These pages are not MOVABLE. They must not end up on ZONE_MOVABLE or
> MIGRATE_CMA.
>
> That should be easy to handle, you have to adjust the gfp_mask to
>     mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
> just as mm/secretmem.c:secretmem_file_create() does.

Okay, fair enough. mapping_set_unevictable() also makes sense.

> 2. These pages behave like mlocked pages and should be accounted as such.
>
> This is probably where the accounting "fun" starts, but maybe it's
> easier than I think to handle.
>
> See mm/secretmem.c:secretmem_mmap(), where we account the pages as
> VM_LOCKED and will consequently check per-process mlock limits. As we
> don't mmap(), the same approach cannot be reused.
>
> See drivers/vfio/vfio_iommu_type1.c:vfio_pin_map_dma() and
> vfio_pin_pages_remote() on how to manually account via mm->locked_vm .
>
> But it's a bit hairy because these pages are not actually mapped into
> the page tables of the MM, so it might need some thought. Similarly,
> these pages actually behave like "pinned" (as in mm->pinned_vm), but we
> just don't increase the refcount AFAIR. Again, accounting really is a
> bit hairy ...

Accounting is fun indeed. Non-mapped mlocked memory is going to be confusing.

Hm... I will look closer.

-- 
 Kirill A. Shutemov
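
For readers following the accounting discussion: the manual mm->locked_vm
accounting that David points to amounts to charging pages against a
per-process limit without any mapping being involved. Below is a userspace
toy model of that idea only, not kernel code and not what the patch does.
struct mm_model, account_locked() and unaccount_locked() are hypothetical
names made up for this sketch; limit_pages is assumed to stand in for the
RLIMIT_MEMLOCK limit expressed in pages, and the privileged flag for a
caller allowed to bypass it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process counter (mm->locked_vm). */
struct mm_model {
    unsigned long locked_vm;    /* pages currently charged as locked */
};

/*
 * Charge npages against the limit.
 * limit_pages: assumed to model the RLIMIT_MEMLOCK limit in pages.
 * privileged:  assumed to model a caller exempt from the limit.
 * Returns 0 on success, -ENOMEM if the charge would exceed the limit.
 */
static int account_locked(struct mm_model *mm, unsigned long npages,
                          unsigned long limit_pages, bool privileged)
{
    /* Written as a subtraction so the comparison cannot overflow. */
    if (!privileged &&
        (mm->locked_vm > limit_pages ||
         npages > limit_pages - mm->locked_vm))
        return -ENOMEM;

    mm->locked_vm += npages;
    return 0;
}

/* Undo a charge, e.g. when a hole is punched and the pages are freed. */
static void unaccount_locked(struct mm_model *mm, unsigned long npages)
{
    mm->locked_vm = npages > mm->locked_vm ? 0 : mm->locked_vm - npages;
}
```

In this model, the charge would happen when pages are allocated with
fallocate and the uncharge when a hole is punched. Which mm to charge when
the pages are never mapped is exactly the open question in the thread, and
the sketch does not answer it.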