Date: Wed, 2 Apr 2025 17:19:38 -0700
In-Reply-To: <20250328153133.3504118-5-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
 <20250328153133.3504118-5-tabba@google.com>
Subject: Re: [PATCH v7 4/7] KVM: guest_memfd: Folio sharing states and
 functions that manage their transition
From: Sean Christopherson
To: Fuad Tabba
Cc: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org,
 pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, viro@zeniv.linux.org.uk, brauner@kernel.org,
 willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com,
 yilun.xu@intel.com, chao.p.peng@linux.intel.com, jarkko@kernel.org,
 amoorthy@google.com, dmatlack@google.com, isaku.yamahata@intel.com,
 mic@digikod.net, vbabka@suse.cz, vannapurve@google.com,
 ackerleytng@google.com, mail@maciej.szmigiero.name, david@redhat.com,
 michael.roth@amd.com, wei.w.wang@intel.com, liam.merwick@oracle.com,
 isaku.yamahata@gmail.com, kirill.shutemov@linux.intel.com,
 suzuki.poulose@arm.com, steven.price@arm.com, quic_eberman@quicinc.com,
 quic_mnalajal@quicinc.com, quic_tsoni@quicinc.com,
 quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com,
 catalin.marinas@arm.com, james.morse@arm.com, yuzenghui@huawei.com,
 oliver.upton@linux.dev, maz@kernel.org, will@kernel.org,
 qperret@google.com, keirf@google.com, roypat@amazon.co.uk,
 shuah@kernel.org, hch@infradead.org, jgg@nvidia.com, rientjes@google.com,
 jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com

On Fri, Mar 28, 2025, Fuad Tabba wrote:
> @@ -389,22 +381,211 @@ static void kvm_gmem_init_mount(void)
>  }
>
>  #ifdef CONFIG_KVM_GMEM_SHARED_MEM
> -static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
> +/*
> + * An enum of the valid folio sharing states:
> + * Bit 0: set if not shared with the guest (guest cannot fault it in)
> + * Bit 1: set if not shared with the host (host cannot fault it in)
> + */
> +enum folio_shareability {
> +	KVM_GMEM_ALL_SHARED	= 0b00,	/* Shared with the host and the guest. */
> +	KVM_GMEM_GUEST_SHARED	= 0b10, /* Shared only with the guest. */
> +	KVM_GMEM_NONE_SHARED	= 0b11, /* Not shared, transient state. */

Absolutely not.  The proper way to define bitmasks is to use BIT(xxx).  Based
on past discussions, I suspect you went this route so that the most common
value is '0' to avoid extra, but that should be an implementation detail
buried deep in the low level xarray handling, not a

The name is also bizarre and confusing.  To map memory into the guest as
private, it needs to be in KVM_GMEM_GUEST_SHARED.  That's completely
unworkable.

Of course, it's not at all obvious that you're actually trying to create a
bitmask.  The above looks like an inverted bitmask, but then it's used as if
the values don't matter.

	return (r == KVM_GMEM_ALL_SHARED || r == KVM_GMEM_GUEST_SHARED);

Given that I can't think of a sane use case for allowing guest_memfd to be
mapped into the host but not the guest (modulo temporary demand paging
scenarios), I think all we need is:

	KVM_GMEM_SHARED		= BIT(0),
	KVM_GMEM_INVALID	= BIT(1),

As for optimizing xarray storage, assuming it's actually a bitmask, simply let
KVM specify which bits to invert when storing/loading to/from the xarray so
that KVM can optimize storage for the most common value (which is presumably
KVM_GMEM_SHARED on arm64?).

If KVM_GMEM_SHARED is the desired "default", invert bit 0, otherwise don't.
If for some reason we get to a state where the default value is multiple bits,
the inversion trick still works.  E.g. if KVM_GMEM_SHARED were a composite
value, then invert bits 0 and 1.  The polarity shenanigans should be easy to
hide in two low level macros/helpers.

> +/*
> + * Returns true if the folio is shared with the host and the guest.

This is a superfluous comment.  Simple predicates should be self-explanatory
based on function name alone.

> + *
> + * Must be called with the offsets_lock lock held.

Drop these types of comments and document through code, i.e. via lockdep
assertions (which you already have).

> + */
> +static bool kvm_gmem_offset_is_shared(struct inode *inode, pgoff_t index)
> +{
> +	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
> +	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
> +	unsigned long r;
> +
> +	lockdep_assert_held(offsets_lock);
>
> -	/* For now, VMs that support shared memory share all their memory. */
> -	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
> +	r = xa_to_value(xa_load(shared_offsets, index));
> +
> +	return r == KVM_GMEM_ALL_SHARED;
> +}
> +
> +/*
> + * Returns true if the folio is shared with the guest (not transitioning).
> + *
> + * Must be called with the offsets_lock lock held.

See above.

>  static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)

This should be something like kvm_gmem_fault_shared() to make it abundantly
clear what's being done.  Because it took me a few looks to realize this is
faulting memory into host userspace, not into the guest.
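
For illustration only, the inversion trick described above could look
something like the userspace sketch below.  It assumes KVM_GMEM_SHARED is the
desired default, so bit 0 is flipped on the way into and out of the xarray and
the default ends up stored as 0.  BIT() is redefined locally so the snippet
builds outside the kernel, and GMEM_XA_INVERT_MASK, gmem_state_to_xa(),
gmem_xa_to_state() and gmem_polarity_selftest() are hypothetical names, not
anything from the posted series or from existing KVM code.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(nr)			(1UL << (nr))

enum kvm_gmem_state {
	KVM_GMEM_SHARED		= BIT(0),
	KVM_GMEM_INVALID	= BIT(1),
};

/*
 * Hypothetical: bits to flip so that the assumed default state is stored
 * as 0.  If the default were a composite value, the mask would simply
 * cover more bits, e.g. BIT(0) | BIT(1).
 */
#define GMEM_XA_INVERT_MASK	((unsigned long)KVM_GMEM_SHARED)

/* Hypothetical helper: logical state -> value stored in the xarray. */
static inline unsigned long gmem_state_to_xa(unsigned long state)
{
	return state ^ GMEM_XA_INVERT_MASK;
}

/* Hypothetical helper: stored value -> logical state (XOR undoes XOR). */
static inline unsigned long gmem_xa_to_state(unsigned long stored)
{
	return stored ^ GMEM_XA_INVERT_MASK;
}

/* Small self-check of the polarity handling. */
void gmem_polarity_selftest(void)
{
	/* The assumed default is stored as 0 ... */
	assert(gmem_state_to_xa(KVM_GMEM_SHARED) == 0);
	/* ... and a stored 0 decodes back to the default. */
	assert(gmem_xa_to_state(0) == KVM_GMEM_SHARED);
	/* Non-default states survive a round trip unchanged. */
	assert(gmem_xa_to_state(gmem_state_to_xa(KVM_GMEM_INVALID)) ==
	       KVM_GMEM_INVALID);
}
```

Callers would only ever see KVM_GMEM_SHARED and KVM_GMEM_INVALID with normal
polarity; the XOR stays confined to the two helpers that sit next to the
xarray store/load.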