linux-mm.kvack.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: ankita@nvidia.com, david@redhat.com
Cc: maz@kernel.org, oliver.upton@linux.dev, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	catalin.marinas@arm.com, will@kernel.org, ryan.roberts@arm.com,
	shahuang@redhat.com, lpieralisi@kernel.org, ddutile@redhat.com,
	seanjc@google.com, aniketa@nvidia.com, cjia@nvidia.com,
	kwankhede@nvidia.com, kjaju@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com, zhiw@nvidia.com,
	mochs@nvidia.com, udhoke@nvidia.com, dnigam@nvidia.com,
	alex.williamson@redhat.com, sebastianene@google.com,
	coltonlewis@google.com, kevin.tian@intel.com, yi.l.liu@intel.com,
	ardb@kernel.org, akpm@linux-foundation.org, gshan@redhat.com,
	linux-mm@kvack.org, tabba@google.com, qperret@google.com,
	kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, maobibo@loongson.cn
Subject: Re: [PATCH v9 5/6] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
Date: Fri, 4 Jul 2025 11:04:31 -0300	[thread overview]
Message-ID: <20250704140431.GH1410929@nvidia.com> (raw)
In-Reply-To: <20250621042111.3992-6-ankita@nvidia.com>

On Sat, Jun 21, 2025 at 04:21:10AM +0000, ankita@nvidia.com wrote:
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1681,18 +1681,53 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (is_error_noslot_pfn(pfn))
>  		return -EFAULT;
>  
> +	/*
> +	 * Check if this is non-struct page memory PFN, and cannot support
> +	 * CMOs. It could potentially be unsafe to access as cachable.
> +	 */
>  	if (vm_flags & (VM_PFNMAP | VM_MIXEDMAP) && !pfn_is_map_memory(pfn)) {
>  		/*
> -		 * If the page was identified as device early by looking at
> -		 * the VMA flags, vma_pagesize is already representing the
> -		 * largest quantity we can map.  If instead it was mapped
> -		 * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
> -		 * and must not be upgraded.
> -		 *
> -		 * In both cases, we don't let transparent_hugepage_adjust()
> -		 * change things at the last minute.
> +		 * COW VM_PFNMAP is possible when doing a MAP_PRIVATE
> +		 * /dev/mem mapping on systems that allow such mapping.
> +		 * Reject such case.
>  		 */
> -		s2_force_noncacheable = true;
> +		if (is_cow_mapping(vm_flags))
> +			return -EINVAL;

I would still like an explanation of why we need to block this.

COW PFNMAP is like MIXEDMAP: you end up with a VMA containing a
mixture of MMIO and normal pages. Arguably you are supposed to use
vm_normal_page() rather than pfn_is_map_memory(), but that seems
difficult for KVM.

Given that we exclude the cacheable case with pfn_is_map_memory(), we
already know this is non-struct-page memory, so why do we need to
block the COW?

I think the basic rule we are going for is that, within the VMA, the
non-normal/special PTEs have to follow the vma->vm_pgprot while the
normal pages have to be cacheable.

So if we find a normal page (ie pfn_is_map_memory()) then we know it
is cacheable and s2_force_noncacheable = false. Otherwise we use the
vm_pgprot to decide if the special PTE is cacheable.

David can you think of any reason to have this is_cow_mapping() test?

> +		if (is_vma_cacheable) {
> +			/*
> +			 * Whilst the VMA owner expects cacheable mapping to this
> +			 * PFN, hardware also has to support the FWB and CACHE DIC
> +			 * features.
> +			 *
> +			 * ARM64 KVM relies on kernel VA mapping to the PFN to
> +			 * perform cache maintenance as the CMO instructions work on
> +			 * virtual addresses. VM_PFNMAP region are not necessarily
> +			 * mapped to a KVA and hence the presence of hardware features
> +			 * S2FWB and CACHE DIC are mandatory for cache maintenance.
> +			 *
> +			 * Check if the hardware supports it before allowing the VMA
> +			 * owner request for cacheable mapping.
> +			 */
> +			if (!kvm_arch_supports_cacheable_pfnmap())
> +				return -EFAULT;
> +
> +			/* Cannot degrade cachable to non cachable */
> +			if (s2_force_noncacheable)
> +				return -EINVAL;

What am I missing? After the whole series is applied this is the first
reference to s2_force_noncacheable after it is initialized to
false. So this can't happen?

> +		} else {
> +			/*
> +			 * If the page was identified as device early by looking at
> +			 * the VMA flags, vma_pagesize is already representing the
> +			 * largest quantity we can map.  If instead it was mapped
> +			 * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
> +			 * and must not be upgraded.
> +			 *
> +			 * In both cases, we don't let transparent_hugepage_adjust()
> +			 * change things at the last minute.
> +			 */
> +			s2_force_noncacheable = true;
> +		}


Then this logic that immediately follows:

        if (is_vma_cacheable && s2_force_noncacheable)
                return -EINVAL;

doesn't make a lot of sense either: the only place that sets
s2_force_noncacheable = true is the else branch of 'if (is_vma_cacheable)',
so this is dead code too.

Seems like this still needs some cleanup to remove these impossible
conditions. The logic makes sense to me otherwise, though.

Jason



Thread overview: 33+ messages
2025-06-21  4:21 [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable ankita
2025-06-21  4:21 ` [PATCH v9 1/6] KVM: arm64: Rename the device variable to s2_force_noncacheable ankita
2025-07-04 13:41   ` Jason Gunthorpe
2025-07-04 13:57   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 2/6] KVM: arm64: Update the check to detect device memory ankita
2025-07-04 13:43   ` Jason Gunthorpe
2025-07-04 14:02   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 3/6] KVM: arm64: Block cacheable PFNMAP mapping ankita
2025-06-27 13:49   ` Will Deacon
2025-06-30  1:56     ` Ankit Agrawal
2025-06-30 12:25       ` Jason Gunthorpe
2025-07-04 12:21         ` David Hildenbrand
2025-07-04 16:04           ` Will Deacon
2025-07-04 16:47             ` Jason Gunthorpe
2025-07-08 12:47               ` Will Deacon
2025-07-04 13:45   ` Jason Gunthorpe
2025-07-04 14:09   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 4/6] KVM: arm64: New function to determine hardware cache management support ankita
2025-07-04 13:47   ` Jason Gunthorpe
2025-07-04 14:10   ` David Hildenbrand
2025-06-21  4:21 ` [PATCH v9 5/6] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags ankita
2025-07-04 14:04   ` Jason Gunthorpe [this message]
2025-07-04 14:13     ` David Hildenbrand
2025-07-04 16:51       ` Ankit Agrawal
2025-06-21  4:21 ` [PATCH v9 6/6] KVM: arm64: Expose new KVM cap for cacheable PFNMAP ankita
2025-07-04 13:44   ` Jason Gunthorpe
2025-07-04 14:15     ` David Hildenbrand
2025-07-04 15:04       ` Jason Gunthorpe
2025-07-04 16:20         ` Ankit Agrawal
2025-07-04 16:56           ` Jason Gunthorpe
2025-06-27  5:03 ` [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable Ankit Agrawal
2025-07-02  9:33 ` Ankit Agrawal
2025-07-02 16:51   ` Donald Dutile
