linux-mm.kvack.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com, david@redhat.com
Cc: maz@kernel.org, oliver.upton@linux.dev, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, will@kernel.org, ryan.roberts@arm.com,
	shahuang@redhat.com, lpieralisi@kernel.org, ddutile@redhat.com,
	seanjc@google.com, aniketa@nvidia.com, cjia@nvidia.com,
	kwankhede@nvidia.com, kjaju@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com, zhiw@nvidia.com,
	mochs@nvidia.com, udhoke@nvidia.com, dnigam@nvidia.com,
	alex.williamson@redhat.com, sebastianene@google.com,
	coltonlewis@google.com, kevin.tian@intel.com, yi.l.liu@intel.com,
	ardb@kernel.org, akpm@linux-foundation.org, gshan@redhat.com,
	linux-mm@kvack.org, tabba@google.com, qperret@google.com,
	kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, maobibo@loongson.cn
Subject: Re: [PATCH v9 5/6] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
Date: Fri, 4 Jul 2025 11:04:31 -0300	[thread overview]
Message-ID: <20250704140431.GH1410929@nvidia.com> (raw)
In-Reply-To: <20250621042111.3992-6-ankita@nvidia.com>

On Sat, Jun 21, 2025 at 04:21:10AM +0000, ankita@nvidia.com wrote:
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1681,18 +1681,53 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (is_error_noslot_pfn(pfn))
>  		return -EFAULT;
>  
> +	/*
> +	 * Check if this is non-struct page memory PFN, and cannot support
> +	 * CMOs. It could potentially be unsafe to access as cachable.
> +	 */
>  	if (vm_flags & (VM_PFNMAP | VM_MIXEDMAP) && !pfn_is_map_memory(pfn)) {
>  		/*
> -		 * If the page was identified as device early by looking at
> -		 * the VMA flags, vma_pagesize is already representing the
> -		 * largest quantity we can map.  If instead it was mapped
> -		 * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
> -		 * and must not be upgraded.
> -		 *
> -		 * In both cases, we don't let transparent_hugepage_adjust()
> -		 * change things at the last minute.
> +		 * COW VM_PFNMAP is possible when doing a MAP_PRIVATE
> +		 * /dev/mem mapping on systems that allow such mapping.
> +		 * Reject such case.
>  		 */
> -		s2_force_noncacheable = true;
> +		if (is_cow_mapping(vm_flags))
> +			return -EINVAL;

I would still like an explanation of why we need to block this.

COW PFNMAP is like MIXEDMAP: you end up with a VMA containing a
mixture of MMIO and normal pages. Arguably you are supposed to use
vm_normal_page() rather than pfn_is_map_memory(), but that seems
difficult for KVM.

Given that we exclude the cacheable case with pfn_is_map_memory(), we
already know this is non-struct-page memory, so why do we need to
block the COW?

I think the basic rule we are going for is that, within the VMA, the
non-normal/special PTEs have to follow the vma->vm_pgprot while the
normal pages have to be cacheable.

So if we find a normal page (ie pfn_is_map_memory()) then we know it
is cacheable and s2_force_noncacheable = false. Otherwise we use the
vm_pgprot to decide if the special PTE is cacheable.

David can you think of any reason to have this is_cow_mapping() test?

> +		if (is_vma_cacheable) {
> +			/*
> +			 * Whilst the VMA owner expects cacheable mapping to this
> +			 * PFN, hardware also has to support the FWB and CACHE DIC
> +			 * features.
> +			 *
> +			 * ARM64 KVM relies on kernel VA mapping to the PFN to
> +			 * perform cache maintenance as the CMO instructions work on
> +			 * virtual addresses. VM_PFNMAP region are not necessarily
> +			 * mapped to a KVA and hence the presence of hardware features
> +			 * S2FWB and CACHE DIC are mandatory for cache maintenance.
> +			 *
> +			 * Check if the hardware supports it before allowing the VMA
> +			 * owner request for cacheable mapping.
> +			 */
> +			if (!kvm_arch_supports_cacheable_pfnmap())
> +				return -EFAULT;
> +
> +			/* Cannot degrade cachable to non cachable */
> +			if (s2_force_noncacheable)
> +				return -EINVAL;

What am I missing? After the whole series is applied this is the first
reference to s2_force_noncacheable after it is initialized to
false. So this can't happen?

> +		} else {
> +			/*
> +			 * If the page was identified as device early by looking at
> +			 * the VMA flags, vma_pagesize is already representing the
> +			 * largest quantity we can map.  If instead it was mapped
> +			 * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
> +			 * and must not be upgraded.
> +			 *
> +			 * In both cases, we don't let transparent_hugepage_adjust()
> +			 * change things at the last minute.
> +			 */
> +			s2_force_noncacheable = true;
> +		}


Then this logic that immediately follows:

        if (is_vma_cacheable && s2_force_noncacheable)
                return -EINVAL;

doesn't make a lot of sense either: the only place that sets
s2_force_noncacheable = true is the else branch of 'if (is_vma_cacheable)',
so this is dead code too.

Seems like this still needs some cleanup to remove these impossible
conditions. The logic makes sense to me otherwise, though.

Jason



Thread overview: 33+ messages
2025-06-21  4:21 [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable ankita
2025-06-21  4:21 ` [PATCH v9 1/6] KVM: arm64: Rename the device variable to s2_force_noncacheable ankita
2025-07-04 13:41   ` Jason Gunthorpe
2025-07-04 13:57   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 2/6] KVM: arm64: Update the check to detect device memory ankita
2025-07-04 13:43   ` Jason Gunthorpe
2025-07-04 14:02   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 3/6] KVM: arm64: Block cacheable PFNMAP mapping ankita
2025-06-27 13:49   ` Will Deacon
2025-06-30  1:56     ` Ankit Agrawal
2025-06-30 12:25       ` Jason Gunthorpe
2025-07-04 12:21         ` David Hildenbrand
2025-07-04 16:04           ` Will Deacon
2025-07-04 16:47             ` Jason Gunthorpe
2025-07-08 12:47               ` Will Deacon
2025-07-04 13:45   ` Jason Gunthorpe
2025-07-04 14:09   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 4/6] KVM: arm64: New function to determine hardware cache management support ankita
2025-07-04 13:47   ` Jason Gunthorpe
2025-07-04 14:10   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 5/6] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags ankita
2025-07-04 14:04   ` Jason Gunthorpe [this message]
2025-07-04 14:13     ` David Hildenbrand
2025-07-04 16:51       ` Ankit Agrawal
2025-06-21  4:21 ` [PATCH v9 6/6] KVM: arm64: Expose new KVM cap for cacheable PFNMAP ankita
2025-07-04 13:44   ` Jason Gunthorpe
2025-07-04 14:15     ` David Hildenbrand
2025-07-04 15:04       ` Jason Gunthorpe
2025-07-04 16:20         ` Ankit Agrawal
2025-07-04 16:56           ` Jason Gunthorpe
2025-06-27  5:03 ` [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable Ankit Agrawal
2025-07-02  9:33 ` Ankit Agrawal
2025-07-02 16:51   ` Donald Dutile
