From: Barry Song <21cnbao@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
"Barry Song" <v-songbaohua@oppo.com>,
"Russell King" <linux@armlinux.org.uk>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Will Deacon" <will@kernel.org>,
"Huacai Chen" <chenhuacai@kernel.org>,
"WANG Xuerui" <kernel@xen0n.name>,
"Madhavan Srinivasan" <maddy@linux.ibm.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Christophe Leroy" <christophe.leroy@csgroup.eu>,
"Paul Walmsley" <pjw@kernel.org>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
"Albert Ou" <aou@eecs.berkeley.edu>,
"Alexandre Ghiti" <alex@ghiti.fr>,
"Alexander Gordeev" <agordeev@linux.ibm.com>,
"Gerald Schaefer" <gerald.schaefer@linux.ibm.com>,
"Heiko Carstens" <hca@linux.ibm.com>,
"Vasily Gorbik" <gor@linux.ibm.com>,
"Christian Borntraeger" <borntraeger@linux.ibm.com>,
"Sven Schnelle" <svens@linux.ibm.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Andy Lutomirski" <luto@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>,
"Borislav Petkov" <bp@alien8.de>,
x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
"David Hildenbrand" <david@kernel.org>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Mike Rapoport" <rppt@kernel.org>,
"Suren Baghdasaryan" <surenb@google.com>,
"Michal Hocko" <mhocko@suse.com>,
"Pedro Falcato" <pfalcato@suse.de>,
"Jarkko Sakkinen" <jarkko@kernel.org>,
"Oscar Salvador" <osalvador@suse.de>,
"Kuninori Morimoto" <kuninori.morimoto.gx@renesas.com>,
"Oven Liyang" <liyangouwen1@oppo.com>,
"Mark Rutland" <mark.rutland@arm.com>,
"Ada Couprie Diaz" <ada.coupriediaz@arm.com>,
"Robin Murphy" <robin.murphy@arm.com>,
"Kristina Martšenko" <kristina.martsenko@arm.com>,
"Kevin Brodsky" <kevin.brodsky@arm.com>,
"Yeoreum Yun" <yeoreum.yun@arm.com>,
"Wentao Guan" <guanwentao@uniontech.com>,
"Thorsten Blum" <thorsten.blum@linux.dev>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Yunhui Cui" <cuiyunhui@bytedance.com>,
"Nam Cao" <namcao@linutronix.de>, "Chris Li" <chrisl@kernel.org>,
"Kairui Song" <kasong@tencent.com>,
"Kemeng Shi" <shikemeng@huaweicloud.com>,
"Nhat Pham" <nphamcs@gmail.com>, "Baoquan He" <bhe@redhat.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, loongarch@lists.linux.dev,
linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] mm: continue using per-VMA lock when retrying page faults after I/O
Date: Thu, 27 Nov 2025 12:22:16 +0800
Message-ID: <CAGsJ_4zyZeLtxVe56OSYQx0OcjETw2ru1FjZjBOnTszMe_MW2g@mail.gmail.com>
In-Reply-To: <aSfO7fA-04SBtTug@casper.infradead.org>

On Thu, Nov 27, 2025 at 12:09 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Nov 27, 2025 at 09:14:36AM +0800, Barry Song wrote:
> > There is no need to always fall back to mmap_lock if the per-VMA
> > lock was released only to wait for pagecache or swapcache to
> > become ready.
>
> Something I've been wondering about is removing all the "drop the MM
> locks while we wait for I/O" gunk. It's a nice amount of code removed:

I think the point is that page fault handlers should avoid holding the VMA
lock or mmap_lock for a long time while waiting for I/O. Otherwise, other
writers and readers waiting on those locks will be stuck until the I/O
completes.

>
>  include/linux/pagemap.h |  8 +---
>  mm/filemap.c            | 98 ++++++++++++-------------------------------------
>  mm/internal.h           | 21 -----------
>  mm/memory.c             | 13 +------
>  mm/shmem.c              |  6 ---
>  5 files changed, 27 insertions(+), 119 deletions(-)
>
> and I'm not sure we still need to do it with per-VMA locks. What I
> have here doesn't boot and I ran out of time to debug it.

I agree there's room for improvement, but merely removing the "drop the MM
locks while waiting for I/O" code is unlikely to improve performance.

For example, we could change the flow to:

1. Release the VMA lock or mmap_lock
2. Lock the folio
3. Re-acquire the VMA lock or mmap_lock
4. Re-check whether we can still map the PTE
5. Map the PTE

Currently, the flow is always:

1. Release the VMA lock or mmap_lock
2. Lock the folio
3. Unlock the folio
4. Re-enter the page fault handling from the beginning

That change would be much more complex, so I'd prefer to land the current
patchset first. At least this way, we avoid falling back to mmap_lock, and
the contention or priority inversion that comes with it, with minimal
changes.

Thanks
Barry