From: Vlastimil Babka <vbabka@suse.cz>
To: Suren Baghdasaryan <surenb@google.com>, akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com,
lorenzo.stoakes@oracle.com, mhocko@suse.com, hannes@cmpxchg.org,
mjguzik@gmail.com, oliver.sang@intel.com,
mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
hughd@google.com, minchan@google.com, jannh@google.com,
shakeel.butt@linux.dev, souravpanda@google.com,
pasha.tatashin@soleen.com, corbet@lwn.net,
linux-doc@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v4 4/5] mm: make vma cache SLAB_TYPESAFE_BY_RCU
Date: Wed, 20 Nov 2024 11:16:45 +0100
Message-ID: <d3e5cb3a-9c58-477f-a7f7-c96f7e856a9f@suse.cz>
In-Reply-To: <20241120000826.335387-5-surenb@google.com>
On 11/20/24 01:08, Suren Baghdasaryan wrote:
> To enable SLAB_TYPESAFE_BY_RCU for the vma cache, we need to ensure that
> object reuse before the RCU grace period is over is detected inside
> lock_vma_under_rcu().
> lock_vma_under_rcu() enters an RCU read section, finds the vma at the
> given address, locks the vma, and checks whether it got detached or
> remapped to cover a different address range. These last checks ensure
> that the vma was not modified after we found it but before we locked it.
> vma reuse introduces several new possibilities:
> 1. vma can be reused after it was found but before it is locked;
> 2. vma can be reused and reinitialized (including changing its vm_mm)
> while being locked in vma_start_read();
> 3. vma can be reused and reinitialized after it was found but before
> it is locked, then attached at a new address or to a new mm while being
> read-locked.
> For case #1, the current checks will help detect when:
> - vma was reused but not yet added into the tree (detached check);
> - vma was reused at a different address range (address check).
> We are missing the check for vm_mm to ensure the reused vma was not
> attached to a different mm. This patch adds the missing check.
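Just to spell out my understanding, the lookup path after this patch would be
roughly the following (paraphrased, not the literal diff; label and variable
names are mine):

	rcu_read_lock();
	vma = mas_walk(&mas);
	if (!vma || !vma_start_read(mm, vma))
		goto inval;

	/* The reused-vma checks; detached must be tested first, see
	 * the ordering for case #3 below. */
	if (unlikely(is_vma_detached(vma)))
		goto inval_end_read;
	if (unlikely(vma->vm_mm != mm))	/* the newly added check */
		goto inval_end_read;
	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
		goto inval_end_read;

	rcu_read_unlock();
	return vma;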
> For case #2, we pass mm to vma_start_read() to prevent access to
> unstable vma->vm_mm.
So we may now be looking at a different mm's mm_lock_seq.sequence and return a
false unlocked result, right? I guess the mm validation in
lock_vma_under_rcu() handles that, but maybe the comment of vma_start_read()
needs updating.
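For reference, I'd expect the function to end up looking roughly like this (a
sketch under my assumptions, not the actual patch), with the sequence always
read from the caller-supplied mm:

static inline bool vma_start_read(struct mm_struct *mm,
				  struct vm_area_struct *vma)
{
	/* The caller's mm, not vma->vm_mm, which may be unstable while
	 * the vma is being reused and reinitialized. */
	if (READ_ONCE(mm->mm_lock_seq.sequence) == READ_ONCE(vma->vm_lock_seq))
		return false;

	if (unlikely(down_read_trylock(&vma->vm_lock.lock) == 0))
		return false;

	/* If the vma got reused and attached to a different mm, the
	 * checks here may have been done against the wrong sequence,
	 * hence the vm_mm revalidation in lock_vma_under_rcu(). */
	if (unlikely(vma->vm_lock_seq ==
		     smp_load_acquire(&mm->mm_lock_seq.sequence))) {
		up_read(&vma->vm_lock.lock);
		return false;
	}
	return true;
}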
> For case #3, we ensure the order in which the vma->detached flag and
> vm_start/vm_end/vm_mm are set and checked. A vma gets attached only after
> vm_start/vm_end/vm_mm have been set, and lock_vma_under_rcu() should check
> vma->detached before checking vm_start/vm_end/vm_mm. This is required
> because attaching a vma happens without the vma write-lock, as opposed to
> vma detaching, which requires the vma write-lock. This patch adds the
> memory barriers inside is_vma_detached() and vma_mark_attached() needed to
> order reads and writes to vma->detached against vm_start/vm_end/vm_mm.
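So essentially the standard release/acquire pairing; a minimal sketch of how I
read it (helper bodies paraphrased from the description above, not copied from
the diff):

static inline void vma_mark_attached(struct vm_area_struct *vma)
{
	/*
	 * Pairs with the smp_load_acquire() in is_vma_detached():
	 * all prior writes to vm_start/vm_end/vm_mm must be visible
	 * before the vma can be observed as attached.
	 */
	smp_store_release(&vma->detached, false);
}

static inline bool is_vma_detached(struct vm_area_struct *vma)
{
	/*
	 * Pairs with the smp_store_release() above: if we observe the
	 * vma as attached, the subsequent reads of vm_start/vm_end/vm_mm
	 * see at least the values written before it was attached.
	 */
	return smp_load_acquire(&vma->detached);
}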
> After these provisions, SLAB_TYPESAFE_BY_RCU is added to vm_area_cachep.
> This will facilitate vm_area_struct reuse and will minimize the number
> of call_rcu() calls.
> Adding a freeptr_t into vm_area_struct (unioned with vm_start/vm_end)
> could avoid bloating the structure; however, custom free pointers are
> currently not supported in combination with a ctor
> (see the comment for kmem_cache_args.freeptr_offset).
I think there's nothing fundamental preventing us from supporting that; there
was just no user of it. We can do it later.
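FWIW, with the new struct kmem_cache_args the cache creation should then
become roughly the following (a sketch; the extra flags besides
SLAB_TYPESAFE_BY_RCU are my guess, not taken from the patch):

	struct kmem_cache_args args = {
		.ctor = vm_area_ctor,
		/* .freeptr_offset not usable together with .ctor yet,
		 * see above */
	};

	vm_area_cachep = kmem_cache_create("vm_area_struct",
					   sizeof(struct vm_area_struct),
					   &args,
					   SLAB_HWCACHE_ALIGN | SLAB_PANIC |
					   SLAB_TYPESAFE_BY_RCU | SLAB_ACCOUNT);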
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -436,6 +436,11 @@ static struct kmem_cache *vm_area_cachep;
> /* SLAB cache for mm_struct structures (tsk->mm) */
> static struct kmem_cache *mm_cachep;
>
> +static void vm_area_ctor(void *data)
> +{
> + vma_lock_init(data);
> +}
> +
> struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
> {
> struct vm_area_struct *vma;
> @@ -462,8 +467,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
> * orig->shared.rb may be modified concurrently, but the clone
> * will be reinitialized.
> */
> - data_race(memcpy(new, orig, sizeof(*new)));
> - vma_lock_init(new);
> + vma_copy(new, orig);
> INIT_LIST_HEAD(&new->anon_vma_chain);
> #ifdef CONFIG_PER_VMA_LOCK
> /* vma is not locked, can't use vma_mark_detached() */
Here we mark it detached, but we might have already copied it as attached
and confused a reader?
I think this will be covered by what you said in reply to willy:
"vma_copy() will have to also copy vma members individually."
> @@ -475,32 +479,37 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
> return new;
> }
>