From: Fuad Tabba
Date: Thu, 3 Apr 2025 10:11:34 +0100
Subject: Re: [PATCH v7 4/7] KVM: guest_memfd: Folio sharing states and functions that manage their transition
To: Sean Christopherson
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk, brauner@kernel.org,
 willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com,
 yilun.xu@intel.com, chao.p.peng@linux.intel.com, jarkko@kernel.org,
 amoorthy@google.com, dmatlack@google.com, isaku.yamahata@intel.com,
 mic@digikod.net, vbabka@suse.cz, vannapurve@google.com,
 ackerleytng@google.com, mail@maciej.szmigiero.name, david@redhat.com,
 michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
 isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
 suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
 catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
 oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
 jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com
References: <20250328153133.3504118-1-tabba@google.com> <20250328153133.3504118-5-tabba@google.com>

Hi Sean,

On Thu, 3 Apr 2025 at 01:19, Sean Christopherson wrote:
>
> On Fri, Mar 28, 2025, Fuad Tabba wrote:
> > @@ -389,22 +381,211 @@ static void kvm_gmem_init_mount(void)
> >  }
> >
> >  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > -static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> > +/*
> > + * An enum of the valid folio sharing states:
> > + * Bit 0: set if not shared with the guest (guest cannot fault it in)
> > + * Bit 1: set if not shared with the host (host cannot fault it in)
> > + */
> > +enum folio_shareability {
> > +	KVM_GMEM_ALL_SHARED	= 0b00,	/* Shared with the host and the guest. */
> > +	KVM_GMEM_GUEST_SHARED	= 0b10, /* Shared only with the guest. */
> > +	KVM_GMEM_NONE_SHARED	= 0b11, /* Not shared, transient state. */
>
> Absolutely not. The proper way to define bitmasks is to use BIT(xxx). Based on
> past discussions, I suspect you went this route so that the most common value
> is '0' to avoid extra, but that should be an implementation detail buried deep
> in the low level xarray handling, not a
>
> The name is also bizarre and confusing. To map memory into the guest as private,
> it needs to be in KVM_GMEM_GUEST_SHARED. That's completely unworkable.
> Of course, it's not at all obvious that you're actually trying to create a bitmask.
> The above looks like an inverted bitmask, but then it's used as if the values don't
> matter.
>
>   return (r == KVM_GMEM_ALL_SHARED || r == KVM_GMEM_GUEST_SHARED);

Ack.

> Given that I can't think of a sane use case for allowing guest_memfd to be mapped
> into the host but not the guest (modulo temporary demand paging scenarios), I
> think all we need is:
>
>	KVM_GMEM_SHARED		= BIT(0),
>	KVM_GMEM_INVALID	= BIT(1),

We need the third state for the transient case, i.e., when a page is
transitioning from being shared with the host to going back to
private, in order to ensure that neither the guest nor the host can
install a mapping/fault it in. But I see your point.

> As for optimizing xarray storage, assuming it's actually a bitmask, simply let
> KVM specify which bits to invert when storing/loading to/from the xarray so that
> KVM can optimize storage for the most common value (which is presumably
> KVM_GMEM_SHARED on arm64?).
>
> If KVM_GMEM_SHARED is the desired "default", invert bit 0, otherwise don't. If
> for some reason we get to a state where the default value is multiple bits, the
> inversion trick still works. E.g. if KVM_GMEM_SHARED were a composite value,
> then invert bits 0 and 1. The polarity shenanigans should be easy to hide in two
> low level macros/helpers.

Ack.

> > +/*
> > + * Returns true if the folio is shared with the host and the guest.
>
> This is a superfluous comment. Simple predicates should be self-explanatory
> based on function name alone.
>
> > + *
> > + * Must be called with the offsets_lock lock held.
>
> Drop these types of comments and document through code, i.e. via lockdep
> assertions (which you already have).

Ack.

> > + */
> > +static bool kvm_gmem_offset_is_shared(struct inode *inode, pgoff_t index)
> > +{
> > +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> > +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> > +	unsigned long r;
> > +
> > +	lockdep_assert_held(offsets_lock);
> >
> > -	/* For now, VMs that support shared memory share all their memory. */
> > -	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> > +	r = xa_to_value(xa_load(shared_offsets, index));
> > +
> > +	return r == KVM_GMEM_ALL_SHARED;
> > +}
> > +
> > +/*
> > + * Returns true if the folio is shared with the guest (not transitioning).
> > + *
> > + * Must be called with the offsets_lock lock held.
>
> See above.

Ack.

> >  static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
>
> This should be something like kvm_gmem_fault_shared() to make it abundantly clear
> what's being done. Because it took me a few looks to realize this is faulting
> memory into host userspace, not into the guest.

Ack.

Thanks!
/fuad
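
For reference, a minimal userspace sketch of the inversion trick discussed
above, assuming the two states Sean proposes. The helper names
gmem_state_to_stored() and gmem_stored_to_state() and the GMEM_DEFAULT_STATE
mask are hypothetical and not taken from the patch; BIT() is redefined locally
as a stand-in for the kernel macro.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/* State names follow the suggestion in the review; not from the actual patch. */
enum gmem_state {
	KVM_GMEM_SHARED		= BIT(0),
	KVM_GMEM_INVALID	= BIT(1),
};

#define GMEM_STATE_MASK		(KVM_GMEM_SHARED | KVM_GMEM_INVALID)

/*
 * Hypothetical: the bits to invert on store/load so that the most common
 * state is represented by 0. Here the assumed default is "shared".
 */
#define GMEM_DEFAULT_STATE	KVM_GMEM_SHARED

/* Convert a logical state into the value that would be stored. */
static inline unsigned long gmem_state_to_stored(unsigned long state)
{
	assert((state & ~GMEM_STATE_MASK) == 0);
	return state ^ GMEM_DEFAULT_STATE;
}

/* Convert a stored value back into the logical state (XOR is its own inverse). */
static inline unsigned long gmem_stored_to_state(unsigned long stored)
{
	return stored ^ GMEM_DEFAULT_STATE;
}
```

With KVM_GMEM_SHARED as the assumed default, a shared offset encodes to 0,
which is also what xa_to_value(xa_load()) yields for an index with no entry,
so the common case would not need an xarray entry. If the default were a
composite value, only GMEM_DEFAULT_STATE would change.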