From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>,
Matthew Wilcox <willy@infradead.org>, Guo Ren <guoren@kernel.org>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Alexander Gordeev <agordeev@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
"David S . Miller" <davem@davemloft.net>,
Andreas Larsson <andreas@gaisler.com>,
Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Nicolas Pitre <nico@fluxnic.net>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@redhat.com>,
Konstantin Komarov <almaz.alexandrovich@paragon-software.com>,
Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
Dave Young <dyoung@redhat.com>, Tony Luck <tony.luck@intel.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Dave Martin <Dave.Martin@arm.com>,
James Morse <james.morse@arm.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>, Hugh Dickins <hughd@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
Uladzislau Rezki <urezki@gmail.com>,
Dmitry Vyukov <dvyukov@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Jann Horn <jannh@google.com>, Pedro Falcato <pfalcato@suse.de>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-csky@vger.kernel.org,
linux-mips@vger.kernel.org, linux-s390@vger.kernel.org,
sparclinux@vger.kernel.org, nvdimm@lists.linux.dev,
linux-cxl@vger.kernel.org, linux-mm@kvack.org,
ntfs3@lists.linux.dev, kexec@lists.infradead.org,
kasan-dev@googlegroups.com, Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: [PATCH 00/16] expand mmap_prepare functionality, port more users
Date: Mon, 8 Sep 2025 15:48:36 +0100
Message-ID: <9b463af0-3f29-4816-bd5d-caa282b1a9cd@lucifer.local>
In-Reply-To: <tyoifr2ym3pzx4nwqhdwap57us3msusbsmql7do4pim5ku7qtm@wjyvh5bs633s>
On Mon, Sep 08, 2025 at 03:27:52PM +0200, Jan Kara wrote:
> Hi Lorenzo!
Hey! :)
> > After updating some areas that can simply use mmap_prepare as-is, and
> > performing some housekeeping, we then introduce two new hooks:
> >
> > f_op->mmap_complete - this is invoked at the point of the VMA having been
> > correctly inserted, though with the VMA write lock still held. mmap_prepare
> > must also be specified.
> >
> > This expands the use of mmap_prepare to those callers which need to
> > prepopulate mappings, as well as any which do genuinely require access to
> > the VMA.
> >
> > It's simple - we will let the caller access the VMA, but only once it's
> > established. At this point unwinding is simple - we just unmap the
> > VMA.
> >
> > The VMA is also then correctly initialised at this stage so there can be no
> > issues arising from a not-fully initialised VMA at this point.
> >
> > The other newly added hook is:
> >
> > f_op->mmap_abort - this is only valid in conjunction with mmap_prepare and
> > mmap_complete. This is called should an error arise between mmap_prepare
> > and mmap_complete (not as a result of mmap_prepare but rather some other
> > part of the mapping logic).
> >
> > This is required in case mmap_prepare wishes to establish state or locks
> > which need to be cleaned up on completion. If we did not provide this, then
> > this could not be permitted as this cleanup would otherwise not occur
> > should the mapping fail between the two calls.
>
> So seeing these new hooks makes me wonder: Shouldn't we rather implement
> mmap(2) in a way more similar to how other f_op hooks behave like ->read or
> ->write? I.e., a hook called at rather high level - something like from
> vm_mmap_pgoff() or a similar level - which would just call library
> functions from MM for the stuff it needs to do. Filesystems would just do
> their checks and call the generic mmap function with the vm_ops they want
> to use, more complex users could then fill in the VMA before releasing
> mmap_lock or do cleanup in case of failure... This would seem like a more
> understandable API than several hooks with rules when what gets called.
We can't just do everything at this level, because we need:

a. Information to actually know how to map the VMA before putting it in the
   maple tree.
b. To do anything else that is required once it's there (typically -
   prepopulation).

The crux of this change is to avoid the horrors of a VMA being passed around
before it is properly initialised, while still being accessible for drivers
to do 'whatever' with.

Ideally we'd have only one case, and for _nearly all_ filesystems that is in
fact how it is.

But sadly some _do need_ to do extra work afterwards, most notably,
prepopulation.
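
To make the ordering concrete, here is a rough userspace toy model of the
prepare -> insert -> complete/abort sequence described above. It is only a
sketch and not kernel code: every toy_* name, struct layout and hook
signature below is an assumption made up for illustration, and the real
hooks operate on kernel types rather than these stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins only; these are NOT the kernel's descriptor or VMA types. */
struct toy_state { bool lock_held; bool populated; bool aborted; };
struct toy_desc { unsigned long start, end; void *private_data; };
struct toy_vma { unsigned long start, end; bool inserted; };

/* Hypothetical hook table; the signatures are assumptions for this sketch. */
struct toy_file_ops {
	int (*mmap_prepare)(struct toy_desc *desc);
	int (*mmap_complete)(struct toy_vma *vma, void *private_data);
	void (*mmap_abort)(void *private_data);
};

/* (a) Describe the mapping; may take state/locks that need cleanup later. */
static int toy_prepare(struct toy_desc *desc)
{
	struct toy_state *st = desc->private_data;

	st->lock_held = true;
	return 0;
}

/* (b) Runs only once the VMA is established, e.g. to prepopulate it. */
static int toy_complete(struct toy_vma *vma, void *private_data)
{
	struct toy_state *st = private_data;

	assert(vma->inserted);	/* never sees a half-initialised VMA */
	st->populated = true;
	st->lock_held = false;
	return 0;
}

/* Cleanup if the mapping fails between prepare and complete. */
static void toy_abort(void *private_data)
{
	struct toy_state *st = private_data;

	st->lock_held = false;
	st->aborted = true;
}

static const struct toy_file_ops toy_ops = {
	.mmap_prepare	= toy_prepare,
	.mmap_complete	= toy_complete,
	.mmap_abort	= toy_abort,
};

/* Toy "core mm" path; fail_insert simulates an error after prepare. */
static int toy_mmap(const struct toy_file_ops *ops, struct toy_desc *desc,
		    bool fail_insert)
{
	struct toy_vma vma = { 0 };
	int err;

	err = ops->mmap_prepare(desc);
	if (err)
		return err;	/* prepare itself failed: abort is not called */

	if (fail_insert) {
		/* Error between the two hooks: give the driver its cleanup. */
		if (ops->mmap_abort)
			ops->mmap_abort(desc->private_data);
		return -ENOMEM;
	}

	/* The VMA is fully set up and inserted before the driver sees it. */
	vma.start = desc->start;
	vma.end = desc->end;
	vma.inserted = true;

	if (ops->mmap_complete) {
		err = ops->mmap_complete(&vma, desc->private_data);
		if (err)
			vma.inserted = false;	/* unwind: just unmap */
	}
	return err;
}
```

Calling toy_mmap(&toy_ops, &desc, false) walks the success path, and passing
true for fail_insert exercises the abort path instead. The point of the
sketch is only that the driver never gets its hands on the VMA until it is
established, and that anything taken in prepare has a defined place to be
released.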
Cheers, Lorenzo