From: Jason Gunthorpe <jgg@ziepe.ca>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: Hou Tao <houtao@huaweicloud.com>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-mm@kvack.org, linux-nvme@lists.infradead.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	Alistair Popple <apopple@nvidia.com>,
	Leon Romanovsky <leonro@nvidia.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Tejun Heo <tj@kernel.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Danilo Krummrich <dakr@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Keith Busch <kbusch@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	houtao1@huawei.com
Subject: Re: [PATCH 10/13] PCI/P2PDMA: support compound page in p2pmem_alloc_mmap()
Date: Wed, 7 Jan 2026 16:24:24 -0400
Message-ID: <20260107202424.GC340082@ziepe.ca>
In-Reply-To: <07a785e5-5d2e-4c81-a834-1237c79fdd51@deltatee.com>

On Mon, Dec 22, 2025 at 10:04:17AM -0700, Logan Gunthorpe wrote:
> I would have expected this code to allocate an appropriately aligned
> block of the p2p memory based on the requirements of the current
> mapping, not based on alignment requirements established when the device
> is probed.

Yeah, I don't think this is right either.

I think the flow has become confused by trying to set a static
vmemmap_shift when creating the pgmap. That is not how something like
this should work at all.

Instead, the basic idea should be that each mmap system call
determines what folio order it would like to have, allocates an
aligned range of physical memory from the genpool, and then turns the
folios in that range into a single high order folio.

Finally, the high order folio is installed in one shot, with the mm
dealing with placing it optimally in the right page table levels.
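
The allocation step of the flow above can be modelled in userspace. The
following is only a toy sketch, not kernel code: `toy_pool` and
`toy_pool_alloc_order()` are hypothetical names, the pool is a plain
bitmap of 4 KiB pages rather than a real genpool, and the pool base is
assumed to be page aligned and non-zero. It shows one way to hand out a
naturally aligned run of `1 << order` pages even when the pool itself
does not start on that alignment.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u    /* assumption: 4 KiB base pages */
#define TOY_POOL_PAGES 2048u  /* assumption: an 8 MiB pool */

/* Hypothetical stand-in for a genpool: one "used" flag per page. */
struct toy_pool {
	uint64_t base;              /* physical address of page 0 */
	bool used[TOY_POOL_PAGES];
};

/*
 * Return the physical address of a free run of (1 << order) pages whose
 * start is aligned to its own size, or 0 if no such run is available.
 */
static uint64_t toy_pool_alloc_order(struct toy_pool *p, unsigned int order)
{
	uint64_t npages, first_pfn, start, i, j;

	if (order >= 32)
		return 0;

	npages = UINT64_C(1) << order;
	first_pfn = p->base >> TOY_PAGE_SHIFT;
	/* First pool index whose page frame number is a multiple of npages. */
	start = (npages - (first_pfn & (npages - 1))) & (npages - 1);

	/* Only aligned candidates are visited, so step by npages. */
	for (i = start; i + npages <= TOY_POOL_PAGES; i += npages) {
		for (j = 0; j < npages && !p->used[i + j]; j++)
			;
		if (j != npages)
			continue;       /* some page in this run is busy */
		for (j = 0; j < npages; j++)
			p->used[i + j] = true;
		return p->base + (i << TOY_PAGE_SHIFT);
	}
	return 0;
}
```

With a pool based at 0x80001000, an order 9 request skips the unaligned
head of the pool and returns 0x80200000, which is why the alignment can
be chosen per mapping instead of being fixed when the pool is created.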

You could use a heuristic (i.e. I'm 2M size aligned or 1G size aligned)
or maybe use the MAP_HUGE_2MB/MAP_HUGE_1GB flags, or something else
perhaps.
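
A size-based heuristic of that kind could look roughly like the sketch
below. It is a userspace illustration only: `sketch_pick_folio_order()`
is a hypothetical name, and the 4 KiB / 2 MiB / 1 GiB sizes are the
x86-64 values, assumed here for concreteness. It looks only at the
mapping length; a real implementation would also have to consider the
virtual address and what the pool can actually provide.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u  /* assumption: 4 KiB base pages */
#define SKETCH_PMD_SHIFT  21u  /* assumption: 2 MiB PMD mappings */
#define SKETCH_PUD_SHIFT  30u  /* assumption: 1 GiB PUD mappings */

/*
 * Pick the folio order for a mapping of len bytes: the largest of
 * 1 GiB, 2 MiB or a single page such that len is a multiple of it.
 */
static unsigned int sketch_pick_folio_order(uint64_t len)
{
	if (len == 0)
		return 0;
	if ((len & ((UINT64_C(1) << SKETCH_PUD_SHIFT) - 1)) == 0)
		return SKETCH_PUD_SHIFT - SKETCH_PAGE_SHIFT;
	if ((len & ((UINT64_C(1) << SKETCH_PMD_SHIFT) - 1)) == 0)
		return SKETCH_PMD_SHIFT - SKETCH_PAGE_SHIFT;
	return 0;
}
```

Under these assumptions a 6 MiB mapping gets order 9, a 3 GiB mapping
gets order 18, and a mapping of 2 MiB plus one page falls back to
order 0.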

Don't follow what DAX did; this doesn't have the limitations DAX had
to work with.

I also don't think drivers should be open coding the
vm_insert_folio_xx() stuff. The mm should have a helper that accepts a
folio of any order, the VA and the phys, then installs it optimally. So
please don't export vm_insert_folio_pmd()/etc.

Finally, Peter Xu has been working on the issue of setting the
alignment of VMAs when they will be used to hold large aligned folios.
That would make this more useful by avoiding the need for MAP_FIXED:

https://lore.kernel.org/kvm/20251204151003.171039-1-peterx@redhat.com/

That assumes the folio size can be determined early enough in the VMA
setup process, though Lorenzo's recent refactoring here into
mmap_prepare may be helpful.

Jason

