From: Vlastimil Babka <vbabka@suse.cz>
To: Suren Baghdasaryan <surenb@google.com>, akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com,
	lorenzo.stoakes@oracle.com, mhocko@suse.com, hannes@cmpxchg.org,
	mjguzik@gmail.com, oliver.sang@intel.com,
	mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
	oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
	hughd@google.com, minchan@google.com, jannh@google.com,
	shakeel.butt@linux.dev, souravpanda@google.com,
	pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v4 4/5] mm: make vma cache SLAB_TYPESAFE_BY_RCU
Date: Wed, 20 Nov 2024 11:16:45 +0100
Message-ID: <d3e5cb3a-9c58-477f-a7f7-c96f7e856a9f@suse.cz>
In-Reply-To: <20241120000826.335387-5-surenb@google.com>

On 11/20/24 01:08, Suren Baghdasaryan wrote:
> To enable SLAB_TYPESAFE_BY_RCU for the vma cache we need to ensure that
> object reuse before the RCU grace period is over will be detected inside
> lock_vma_under_rcu().
> lock_vma_under_rcu() enters an RCU read section, finds the vma at the
> given address, locks the vma and checks if it got detached or remapped
> to cover a different address range. These last checks are there
> to ensure that the vma was not modified after we found it but before
> locking it.
> vma reuse introduces several new possibilities:
> 1. vma can be reused after it was found but before it is locked;
> 2. vma can be reused and reinitialized (including changing its vm_mm)
> while being locked in vma_start_read();
> 3. vma can be reused and reinitialized after it was found but before
> it is locked, then attached at a new address or to a new mm while being
> read-locked;
> For case #1 the current checks will help detect cases where:
> - vma was reused but not yet added into the tree (detached check)
> - vma was reused at a different address range (address check);
> We are missing the check for vm_mm to ensure the reused vma was not
> attached to a different mm. This patch adds the missing check.
> For case #2, we pass mm to vma_start_read() to prevent access to
> unstable vma->vm_mm.

So we may now be looking at a different mm's mm_lock_seq.sequence and return
a false unlocked result, right? I guess the mm validation in
lock_vma_under_rcu() handles that, but maybe the comment of vma_start_read()
needs updating.
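
To spell out what I mean, here is roughly the check order I'd expect in
lock_vma_under_rcu() with this patch applied. This is a simplified sketch of
my understanding, not the actual code; it omits the retry and userfaultfd
handling of the real function:

	struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
						  unsigned long address)
	{
		MA_STATE(mas, &mm->mm_mt, address, address);
		struct vm_area_struct *vma;

		rcu_read_lock();
		vma = mas_walk(&mas);
		if (!vma || !vma_start_read(mm, vma))
			goto inval;

		/* Must be checked before the vm_mm/range checks below. */
		if (is_vma_detached(vma))
			goto inval_end_read;

		/* The vma may have been reused for a different mm... */
		if (unlikely(vma->vm_mm != mm))
			goto inval_end_read;

		/* ...or reused to cover a different address range. */
		if (unlikely(address < vma->vm_start || address >= vma->vm_end))
			goto inval_end_read;

		rcu_read_unlock();
		return vma;

	inval_end_read:
		vma_end_read(vma);
	inval:
		rcu_read_unlock();
		return NULL;
	}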

> For case #3, we ensure the order in which vma->detached flag and
> vm_start/vm_end/vm_mm are set and checked. vma gets attached after
> vm_start/vm_end/vm_mm were set and lock_vma_under_rcu() should check
> vma->detached before checking vm_start/vm_end/vm_mm. This is required
> because attaching vma happens without vma write-lock, as opposed to
> vma detaching, which requires vma write-lock. This patch adds memory
> barriers inside is_vma_detached() and vma_mark_attached() needed to
> order reads and writes to vma->detached vs vm_start/vm_end/vm_mm.
> After these provisions, SLAB_TYPESAFE_BY_RCU is added to vm_area_cachep.
> This will facilitate vm_area_struct reuse and will minimize the number
> of call_rcu() calls.
> Adding a freeptr_t into vm_area_struct (unioned with vm_start/vm_end)
> could be used to avoid bloating the structure; however, currently
> custom free pointers are not supported in combination with a ctor
> (see the comment for kmem_cache_args.freeptr_offset).

I think there's nothing fundamental preventing us from supporting that;
there was just no user of it so far. We can do it later.
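
As for the ordering in case #3, IIUC it boils down to a release/acquire
pairing on vma->detached. A minimal sketch of what I understand the intent
to be (the actual helpers in the patch may spell the barriers differently):

	static inline void vma_mark_attached(struct vm_area_struct *vma)
	{
		/*
		 * Make the stores to vm_start/vm_end/vm_mm done before
		 * attaching visible before the vma can be seen as attached.
		 */
		smp_store_release(&vma->detached, false);
	}

	static inline bool is_vma_detached(struct vm_area_struct *vma)
	{
		/*
		 * Pairs with the release above, so that the subsequent
		 * vm_mm/vm_start/vm_end checks in lock_vma_under_rcu()
		 * observe the values set before the vma was attached.
		 */
		return smp_load_acquire(&vma->detached);
	}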

> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -436,6 +436,11 @@ static struct kmem_cache *vm_area_cachep;
>  /* SLAB cache for mm_struct structures (tsk->mm) */
>  static struct kmem_cache *mm_cachep;
>  
> +static void vm_area_ctor(void *data)
> +{
> +	vma_lock_init(data);
> +}
> +
>  struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
>  {
>  	struct vm_area_struct *vma;
> @@ -462,8 +467,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
>  	 * orig->shared.rb may be modified concurrently, but the clone
>  	 * will be reinitialized.
>  	 */
> -	data_race(memcpy(new, orig, sizeof(*new)));
> -	vma_lock_init(new);
> +	vma_copy(new, orig);
>  	INIT_LIST_HEAD(&new->anon_vma_chain);
>  #ifdef CONFIG_PER_VMA_LOCK
>  	/* vma is not locked, can't use vma_mark_detached() */

Here we mark it detached but we might have already copied it as attached and
confused a reader?

I think this will be covered by what you said in reply to willy:
"vma_copy() will have to also copy vma members individually."

> @@ -475,32 +479,37 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
>  	return new;
>  }
>  
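
BTW, for reference, with the ctor in place I'd expect the cache creation to
end up roughly like this (written from memory, the exact flag set is a
guess):

	struct kmem_cache_args args = {
		.ctor = vm_area_ctor,
	};

	vm_area_cachep = kmem_cache_create("vm_area_struct",
			sizeof(struct vm_area_struct), &args,
			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT |
			SLAB_TYPESAFE_BY_RCU);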


Thread overview: 27+ messages
2024-11-20  0:08 [PATCH v4 0/5] move per-vma lock into vm_area_struct Suren Baghdasaryan
2024-11-20  0:08 ` [PATCH v4 1/5] mm: introduce vma_start_read_locked{_nested} helpers Suren Baghdasaryan
2024-11-20 22:11   ` Shakeel Butt
2024-11-20  0:08 ` [PATCH v4 2/5] mm: move per-vma lock into vm_area_struct Suren Baghdasaryan
2024-11-20 23:32   ` Shakeel Butt
2024-11-20 23:44     ` Suren Baghdasaryan
2024-11-21  0:04       ` Shakeel Butt
2024-11-21  0:33         ` Suren Baghdasaryan
2024-11-21  7:01           ` Shakeel Butt
2024-11-21 17:05             ` Suren Baghdasaryan
2024-11-21 18:25               ` Shakeel Butt
2024-11-20  0:08 ` [PATCH v4 3/5] mm: mark vma as detached until it's added into vma tree Suren Baghdasaryan
2024-11-21  0:13   ` Shakeel Butt
2024-11-22 16:46   ` Lorenzo Stoakes
2024-11-22 17:47     ` Suren Baghdasaryan
2024-11-20  0:08 ` [PATCH v4 4/5] mm: make vma cache SLAB_TYPESAFE_BY_RCU Suren Baghdasaryan
2024-11-20  4:36   ` Matthew Wilcox
2024-11-20  6:37     ` Suren Baghdasaryan
2024-11-22 22:43       ` Suren Baghdasaryan
2024-11-20 10:16   ` Vlastimil Babka [this message]
2024-11-20 15:54     ` Suren Baghdasaryan
2024-11-20  0:08 ` [PATCH v4 5/5] docs/mm: document latest changes to vm_lock Suren Baghdasaryan
2024-11-20 22:10 ` [PATCH v4 0/5] move per-vma lock into vm_area_struct Shakeel Butt
2024-11-20 23:52   ` Suren Baghdasaryan
2024-11-21  2:00   ` Matthew Wilcox
2024-11-22 11:56     ` Lorenzo Stoakes
2024-11-22 15:06       ` Suren Baghdasaryan
