From: Andrii Nakryiko <andrii@kernel.org>
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org,
viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
gregkh@linuxfoundation.org, linux-mm@kvack.org,
liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org,
Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH v3 1/9] mm: add find_vma()-like API but RCU protected and taking VMA lock
Date: Tue, 4 Jun 2024 17:24:46 -0700 [thread overview]
Message-ID: <20240605002459.4091285-2-andrii@kernel.org> (raw)
In-Reply-To: <20240605002459.4091285-1-andrii@kernel.org>
Existing lock_vma_under_rcu() API assumes exact VMA match, so it's not
a 100% equivalent of find_vma(). There are use cases that do want
find_vma() semantics of finding an exact VMA or the next one.
Also, it's important for such an API to let user distinguish between not
being able to get per-VMA lock and not having any VMAs at or after
provided address.
As such, this patch adds a new find_vma()-like API,
find_and_lock_vma_rcu(), which finds exact or next VMA, attempts to take
per-VMA lock, and if that fails, returns ERR_PTR(-EBUSY). It still
returns NULL if there is no VMA at or after address. In successfuly case
it will return valid and non-isolated VMA with VMA lock taken.
This API will be used in subsequent patch in this patch set to implement
a new user-facing API for querying process VMAs.
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
include/linux/mm.h | 8 ++++++
mm/memory.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 70 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c41c82bcbec2..3ab52b7e124c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -776,6 +776,8 @@ static inline void assert_fault_locked(struct vm_fault *vmf)
mmap_assert_locked(vmf->vma->vm_mm);
}
+struct vm_area_struct *find_and_lock_vma_rcu(struct mm_struct *mm,
+ unsigned long address);
struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
unsigned long address);
@@ -790,6 +792,12 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
static inline void vma_mark_detached(struct vm_area_struct *vma,
bool detached) {}
+struct vm_area_struct *find_and_lock_vma_rcu(struct mm_struct *mm,
+ unsigned long address)
+{
+ return -EOPNOTSUPP;
+}
+
static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
unsigned long address)
{
diff --git a/mm/memory.c b/mm/memory.c
index eef4e482c0c2..c9517742bd6d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5913,6 +5913,68 @@ struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
#endif
#ifdef CONFIG_PER_VMA_LOCK
+/*
+ * find_and_lock_vma_rcu() - Find and lock the VMA for a given address, or the
+ * next VMA. Search is done under RCU protection, without taking or assuming
+ * mmap_lock. Returned VMA is guaranteed to be stable and not isolated.
+
+ * @mm: The mm_struct to check
+ * @addr: The address
+ *
+ * Returns: The VMA associated with addr, or the next VMA.
+ * May return %NULL in the case of no VMA at addr or above.
+ * If the VMA is being modified and can't be locked, -EBUSY is returned.
+ */
+struct vm_area_struct *find_and_lock_vma_rcu(struct mm_struct *mm,
+ unsigned long address)
+{
+ MA_STATE(mas, &mm->mm_mt, address, address);
+ struct vm_area_struct *vma;
+ int err;
+
+ rcu_read_lock();
+retry:
+ vma = mas_find(&mas, ULONG_MAX);
+ if (!vma) {
+ err = 0; /* no VMA, return NULL */
+ goto inval;
+ }
+
+ if (!vma_start_read(vma)) {
+ err = -EBUSY;
+ goto inval;
+ }
+
+ /*
+ * Check since vm_start/vm_end might change before we lock the VMA.
+ * Note, unlike lock_vma_under_rcu() we are searching for VMA covering
+ * address or the next one, so we only make sure VMA wasn't updated to
+ * end before the address.
+ */
+ if (unlikely(vma->vm_end <= address)) {
+ err = -EBUSY;
+ goto inval_end_read;
+ }
+
+ /* Check if the VMA got isolated after we found it */
+ if (vma->detached) {
+ vma_end_read(vma);
+ count_vm_vma_lock_event(VMA_LOCK_MISS);
+ /* The area was replaced with another one */
+ goto retry;
+ }
+
+ rcu_read_unlock();
+ return vma;
+
+inval_end_read:
+ vma_end_read(vma);
+inval:
+ rcu_read_unlock();
+ count_vm_vma_lock_event(VMA_LOCK_ABORT);
+ return ERR_PTR(err);
+}
+
/*
* Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
* stable and not isolated. If the VMA is not found or is being modified the
--
2.43.0
next prev parent reply other threads:[~2024-06-05 0:25 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-05 0:24 [PATCH v3 0/9] ioctl()-based API to query VMAs from /proc/<pid>/maps Andrii Nakryiko
2024-06-05 0:24 ` Andrii Nakryiko [this message]
2024-06-05 0:57 ` [PATCH v3 1/9] mm: add find_vma()-like API but RCU protected and taking VMA lock Matthew Wilcox
2024-06-05 13:33 ` Liam R. Howlett
2024-06-05 16:13 ` Andrii Nakryiko
2024-06-05 16:24 ` Andrii Nakryiko
2024-06-05 16:27 ` Andrii Nakryiko
2024-06-05 17:03 ` Liam R. Howlett
2024-06-05 23:22 ` Suren Baghdasaryan
2024-06-06 16:51 ` Andrii Nakryiko
2024-06-06 17:13 ` Suren Baghdasaryan
2024-06-05 0:24 ` [PATCH v3 2/9] fs/procfs: extract logic for getting VMA name constituents Andrii Nakryiko
2024-06-05 0:24 ` [PATCH v3 3/9] fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps Andrii Nakryiko
2024-06-07 22:31 ` Andrei Vagin
2024-06-10 8:17 ` Andrii Nakryiko
2024-06-12 17:48 ` Andrei Vagin
2024-06-05 0:24 ` [PATCH v3 4/9] fs/procfs: use per-VMA RCU-protected locking in PROCMAP_QUERY API Andrii Nakryiko
2024-06-05 23:15 ` Suren Baghdasaryan
2024-06-06 16:51 ` Andrii Nakryiko
2024-06-06 17:12 ` Suren Baghdasaryan
2024-06-06 18:03 ` Andrii Nakryiko
2024-06-06 17:15 ` Liam R. Howlett
2024-06-06 17:33 ` Suren Baghdasaryan
2024-06-06 18:07 ` Liam R. Howlett
2024-06-06 18:09 ` Andrii Nakryiko
2024-06-06 18:32 ` Liam R. Howlett
2024-06-05 0:24 ` [PATCH v3 5/9] fs/procfs: add build ID fetching to " Andrii Nakryiko
2024-06-05 0:24 ` [PATCH v3 6/9] docs/procfs: call out ioctl()-based PROCMAP_QUERY command existence Andrii Nakryiko
2024-06-05 0:24 ` [PATCH v3 7/9] tools: sync uapi/linux/fs.h header into tools subdir Andrii Nakryiko
2024-06-05 0:24 ` [PATCH v3 8/9] selftests/bpf: make use of PROCMAP_QUERY ioctl if available Andrii Nakryiko
2024-06-05 0:24 ` [PATCH v3 9/9] selftests/bpf: add simple benchmark tool for /proc/<pid>/maps APIs Andrii Nakryiko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240605002459.4091285-2-andrii@kernel.org \
--to=andrii@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=bpf@vger.kernel.org \
--cc=brauner@kernel.org \
--cc=gregkh@linuxfoundation.org \
--cc=liam.howlett@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=rppt@kernel.org \
--cc=surenb@google.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox