linux-mm.kvack.org archive mirror
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Matthew Wilcox <willy@infradead.org>, Guo Ren <guoren@kernel.org>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	"David S . Miller" <davem@davemloft.net>,
	Andreas Larsson <andreas@gaisler.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Nicolas Pitre <nico@fluxnic.net>,
	Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>,
	David Hildenbrand <david@redhat.com>,
	Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
	Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Dave Young <dyoung@redhat.com>, Tony Luck <tony.luck@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Dave Martin <Dave.Martin@arm.com>,
	James Morse <james.morse@arm.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrey Konovalov <andreyknvl@gmail.com>,
	Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-csky@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-s390@vger.kernel.org,
	sparclinux@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-cxl@vger.kernel.org, linux-mm@kvack.org,
	ntfs3@lists.linux.dev, kexec@lists.infradead.org,
	kasan-dev@googlegroups.com, iommu@lists.linux.dev,
	Kevin Tian <kevin.tian@intel.com>, Will Deacon <will@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH v4 09/14] mm: add ability to take further action in vm_area_desc
Date: Thu, 18 Sep 2025 07:09:28 +0100
Message-ID: <df1c197d-ff38-40e9-8466-829bc5d4e642@lucifer.local>
In-Reply-To: <20250917213737.GH1391379@nvidia.com>

On Wed, Sep 17, 2025 at 06:37:37PM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 17, 2025 at 08:11:11PM +0100, Lorenzo Stoakes wrote:
> > +static int mmap_action_finish(struct mmap_action *action,
> > +		const struct vm_area_struct *vma, int err)
> > +{
> > +	/*
> > +	 * If an error occurs, unmap the VMA altogether and return an error. We
> > +	 * only clear the newly allocated VMA, since this function is only
> > +	 * invoked if we do NOT merge, so we only clean up the VMA we created.
> > +	 */
> > +	if (err) {
> > +		const size_t len = vma_pages(vma) << PAGE_SHIFT;
> > +
> > +		do_munmap(current->mm, vma->vm_start, len, NULL);
> > +
> > +		if (action->error_hook) {
> > +			/* We may want to filter the error. */
> > +			err = action->error_hook(err);
> > +
> > +			/* The caller should not clear the error. */
> > +			VM_WARN_ON_ONCE(!err);
> > +		}
> > +		return err;
> > +	}
> > +
> > +	if (action->success_hook)
> > +		return action->success_hook(vma);
>
> I thought you were going to use a single hook function as was
> suggested?
>
> return action->finish_hook(vma, err);

Err, no? I said no to this suggestion from Pedro? I don't like it.

In practice I've found callers need to EITHER do something on success or
filter errors. I think it's more expressive this way.

I also think you make it more likely that a driver will get things wrong if
they intend only to do something on success and you have an 'err'
parameter.
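
To make the trade-off concrete, here is a minimal userspace sketch of the
two-hook dispatch. It is not kernel code: struct vma_model, struct
action_model, action_finish_model() and both example hooks are hypothetical
names used only for illustration. It also leaves out the do_munmap() and
VM_WARN_ON_ONCE() steps from the quoted patch, and it assumes the part of the
function cut off in the quote returns 0 when no success hook is set.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct vm_area_struct. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Mirrors the two optional hooks in the quoted struct mmap_action. */
struct action_model {
	int (*success_hook)(const struct vma_model *vma);
	int (*error_hook)(int err);
};

/*
 * Models only the hook dispatch. The function in the patch also unmaps the
 * VMA on error and warns if the error hook clears the error.
 */
static int action_finish_model(const struct action_model *action,
			       const struct vma_model *vma, int err)
{
	if (err) {
		if (action->error_hook)
			err = action->error_hook(err);
		return err;
	}

	if (action->success_hook)
		return action->success_hook(vma);

	/* Assumption: with no success hook there is nothing left to do. */
	return 0;
}

/* Hypothetical driver A: only filters errors and never sees a VMA. */
static int filter_to_eagain(int err)
{
	return err == -ENOMEM ? -EAGAIN : err;
}

/* Hypothetical driver B: only acts on success and never sees an error. */
static int success_calls;

static int count_success(const struct vma_model *vma)
{
	(void)vma;
	success_calls++;
	return 0;
}
```

Each example hook only receives the arguments relevant to its case, so a
success-only user has no 'err' parameter to mishandle.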

>
> > +int mmap_action_complete(struct mmap_action *action,
> > +			struct vm_area_struct *vma)
> > +{
> > +	switch (action->type) {
> > +	case MMAP_NOTHING:
> > +		break;
> > +	case MMAP_REMAP_PFN:
> > +	case MMAP_IO_REMAP_PFN:
> > +		WARN_ON_ONCE(1); /* nommu cannot handle this. */
>
> This should be:
>
>      if (WARN_ON_ONCE(true))
>          err = -EINVAL;
>
> To abort the thing and try to recover.

'Try to recover'... how exactly...

It'd be a serious programming bug in the kernel, so I'm not sure going out of
our way to error out here is brilliantly valuable. You might even mask the bug
this way, because the mmap() will just fail instead of nuking the process on
fault.

But fine, let me send a fix-patch...
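
For reference, the requested shape relies on WARN_ON_ONCE() evaluating to its
condition, so it can sit inside an if. Below is a rough userspace model of
that pattern, not the actual fix-patch. warn_on_once_model(),
action_complete_nommu_model() and the ACT_* values are hypothetical names, and
the stand-in keeps one global 'warned' flag rather than one per call site as
the kernel macro does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the MMAP_* action types in the patch. */
enum action_type_model { ACT_NOTHING, ACT_REMAP_PFN, ACT_IO_REMAP_PFN };

/* Counts how many warnings were actually printed. */
static int warn_count;

/*
 * Userspace stand-in for WARN_ON_ONCE(): print at most once, and always
 * hand the condition back so the caller can branch on it.
 */
static bool warn_on_once_model(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Models the nommu switch: warn and fail instead of only warning. */
static int action_complete_nommu_model(enum action_type_model type)
{
	int err = 0;

	switch (type) {
	case ACT_NOTHING:
		break;
	case ACT_REMAP_PFN:
	case ACT_IO_REMAP_PFN:
		if (warn_on_once_model(true, "nommu cannot remap PFNs"))
			err = -EINVAL;
		break;
	}
	return err;
}
```

The warning is printed once, but the error is returned on every call, which
is the 'abort and try to recover' behaviour being asked for.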

>
> > diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> > index 07167446dcf4..22ed38e8714e 100644
> > --- a/tools/testing/vma/vma_internal.h
> > +++ b/tools/testing/vma/vma_internal.h
> > @@ -274,6 +274,49 @@ struct mm_struct {
> >
> >  struct vm_area_struct;
> >
> > +
> > +/* What action should be taken after an .mmap_prepare call is complete? */
> > +enum mmap_action_type {
> > +	MMAP_NOTHING,		/* Mapping is complete, no further action. */
> > +	MMAP_REMAP_PFN,		/* Remap PFN range. */
> > +};
> > +
> > +/*
> > + * Describes an action an mmap_prepare hook can instruct to be taken to complete
> > + * the mapping of a VMA. Specified in vm_area_desc.
> > + */
> > +struct mmap_action {
> > +	union {
> > +		/* Remap range. */
> > +		struct {
> > +			unsigned long start;
> > +			unsigned long start_pfn;
> > +			unsigned long size;
> > +			pgprot_t pgprot;
> > +		} remap;
> > +	};
> > +	enum mmap_action_type type;
> > +
> > +	/*
> > +	 * If specified, this hook is invoked after the selected action has been
> > +	 * successfully completed. Note that the VMA write lock is still held.
> > +	 *
> > +	 * The absolute minimum ought to be done here.
> > +	 *
> > +	 * Returns 0 on success, or an error code.
> > +	 */
> > +	int (*success_hook)(const struct vm_area_struct *vma);
> > +
> > +	/*
> > +	 * If specified, this hook is invoked when an error occurred when
> > +	 * attempting the selected action.
> > +	 *
> > +	 * The hook can return an error code in order to filter the error, but
> > +	 * it is not valid to clear the error here.
> > +	 */
> > +	int (*error_hook)(int err);
> > +};
>
> I didn't try to understand what vma_internal.h is for, but should this
> block be an exact copy of the normal one? ie MMAP_IO_REMAP_PFN is missing?

Right. Of course. I'll include that in the fix-patch...

>
> Jason

