From: Andrii Nakryiko <andrii@kernel.org>
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org,
viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
gregkh@linuxfoundation.org, linux-mm@kvack.org,
liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org,
Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH v2 4/9] fs/procfs: use per-VMA RCU-protected locking in PROCMAP_QUERY API
Date: Thu, 23 May 2024 21:10:26 -0700
Message-ID: <20240524041032.1048094-5-andrii@kernel.org>
In-Reply-To: <20240524041032.1048094-1-andrii@kernel.org>

Use the RCU-protected per-VMA lock when looking up the requested VMA
whenever possible, falling back to mmap_lock only if taking the per-VMA
lock fails. This keeps VMA queries from interfering with other critical
tasks, such as page fault handling.

This was suggested by mm folks, and it makes use of a newly added
internal API that works like find_vma() but tries to take the per-VMA
lock.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
fs/proc/task_mmu.c | 42 ++++++++++++++++++++++++++++++++++--------
1 file changed, 34 insertions(+), 8 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8ad547efd38d..2b14d06d1def 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -389,12 +389,30 @@ static int pid_maps_open(struct inode *inode, struct file *file)
)
static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
- unsigned long addr, u32 flags)
+ unsigned long addr, u32 flags,
+ bool *mm_locked)
{
struct vm_area_struct *vma;
+ bool mmap_locked;
+
+ *mm_locked = mmap_locked = false;
next_vma:
- vma = find_vma(mm, addr);
+ if (!mmap_locked) {
+ /* if we haven't yet acquired mmap_lock, try the less disruptive per-VMA lock */
+ vma = find_and_lock_vma_rcu(mm, addr);
+ if (IS_ERR(vma)) {
+ /* failed to take per-VMA lock, fall back to mmap_lock */
+ if (mmap_read_lock_killable(mm))
+ return ERR_PTR(-EINTR);
+
+ *mm_locked = mmap_locked = true;
+ vma = find_vma(mm, addr);
+ }
+ } else {
+ /* if we have mmap_lock, get through the search as fast as possible */
+ vma = find_vma(mm, addr);
+ }
/* no VMA found */
if (!vma)
@@ -428,18 +446,25 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
skip_vma:
/*
* If the user needs closest matching VMA, keep iterating.
+ * But before we proceed we might need to unlock current VMA.
*/
addr = vma->vm_end;
+ if (!mmap_locked)
+ vma_end_read(vma);
if (flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA)
goto next_vma;
no_vma:
- mmap_read_unlock(mm);
+ if (mmap_locked)
+ mmap_read_unlock(mm);
return ERR_PTR(-ENOENT);
}
-static void unlock_vma(struct vm_area_struct *vma)
+static void unlock_vma(struct vm_area_struct *vma, bool mm_locked)
{
- mmap_read_unlock(vma->vm_mm);
+ if (mm_locked)
+ mmap_read_unlock(vma->vm_mm);
+ else
+ vma_end_read(vma);
}
static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
@@ -447,6 +472,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
struct procmap_query karg;
struct vm_area_struct *vma;
struct mm_struct *mm;
+ bool mm_locked;
const char *name = NULL;
char *name_buf = NULL;
__u64 usize;
@@ -475,7 +501,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
if (!mm || !mmget_not_zero(mm))
return -ESRCH;
- vma = query_matching_vma(mm, karg.query_addr, karg.query_flags);
+ vma = query_matching_vma(mm, karg.query_addr, karg.query_flags, &mm_locked);
if (IS_ERR(vma)) {
mmput(mm);
return PTR_ERR(vma);
@@ -542,7 +568,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
}
/* unlock vma/mm_struct and put mm_struct before copying data to user */
- unlock_vma(vma);
+ unlock_vma(vma, mm_locked);
mmput(mm);
if (karg.vma_name_size && copy_to_user((void __user *)karg.vma_name_addr,
@@ -558,7 +584,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
return 0;
out:
- unlock_vma(vma);
+ unlock_vma(vma, mm_locked);
mmput(mm);
kfree(name_buf);
return err;
--
2.43.0