From: "Liam R. Howlett" <Liam.Howlett@oracle.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, willy@infradead.org,
lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
hughd@google.com, minchan@google.com, jannh@google.com,
shakeel.butt@linux.dev, souravpanda@google.com,
pasha.tatashin@soleen.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 0/4] move per-vma lock into vm_area_struct
Date: Mon, 11 Nov 2024 21:35:19 -0500 [thread overview]
Message-ID: <pbcublsqoi6yeev767ezc7qfmcvxipbvz2towavhfgzlmzt26r@h3cwlmrmt7da> (raw)
In-Reply-To: <20241111205506.3404479-1-surenb@google.com>
* Suren Baghdasaryan <surenb@google.com> [241111 15:55]:
> Back when per-vma locks were introduced, vm_lock was moved out of
> vm_area_struct in [1] because of a performance regression caused by
> false cacheline sharing. Recent investigation [2] revealed that the
> regression is limited to the rather old Broadwell microarchitecture,
> and even there it can be mitigated by disabling adjacent-cacheline
> prefetching, see [3].
> This patchset moves vm_lock back into vm_area_struct, aligning it at the
> cacheline boundary and making the slab cache cacheline-aligned as well.
> This causes VMA memory consumption to grow from 160 (vm_area_struct) + 40
> (vm_lock) bytes to 256 bytes:
>
> slabinfo before:
> <name> ... <objsize> <objperslab> <pagesperslab> : ...
> vma_lock ... 40 102 1 : ...
> vm_area_struct ... 160 51 2 : ...
>
> slabinfo after moving vm_lock:
> <name> ... <objsize> <objperslab> <pagesperslab> : ...
> vm_area_struct ... 256 32 2 : ...
>
> Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
> which is 5.5MB per 100000 VMAs.
> To minimize memory overhead, the vm_lock implementation is changed from
> an rw_semaphore (40 bytes) to an atomic (8 bytes), and several
> vm_area_struct members are moved into the last cacheline, resulting
> in a less fragmented structure:
>
> struct vm_area_struct {
> union {
> struct {
> long unsigned int vm_start; /* 0 8 */
> long unsigned int vm_end; /* 8 8 */
> }; /* 0 16 */
> struct callback_head vm_rcu ; /* 0 16 */
> } __attribute__((__aligned__(8))); /* 0 16 */
> struct mm_struct * vm_mm; /* 16 8 */
> pgprot_t vm_page_prot; /* 24 8 */
> union {
> const vm_flags_t vm_flags; /* 32 8 */
> vm_flags_t __vm_flags; /* 32 8 */
> }; /* 32 8 */
> bool detached; /* 40 1 */
>
> /* XXX 3 bytes hole, try to pack */
>
> unsigned int vm_lock_seq; /* 44 4 */
> struct list_head anon_vma_chain; /* 48 16 */ 40 + 16
> /* --- cacheline 1 boundary (64 bytes) --- */
> struct anon_vma * anon_vma; /* 64 8 */ 56 + 8
> const struct vm_operations_struct * vm_ops; /* 72 8 */
> long unsigned int vm_pgoff; /* 80 8 */
> struct file * vm_file; /* 88 8 */
> void * vm_private_data; /* 96 8 */
> atomic_long_t swap_readahead_info; /* 104 8 */
> struct mempolicy * vm_policy; /* 112 8 */
>
> /* XXX 8 bytes hole, try to pack */
>
> /* --- cacheline 2 boundary (128 bytes) --- */
> struct vma_lock vm_lock (__aligned__(64)); /* 128 4 */
>
> /* XXX 4 bytes hole, try to pack */
>
> struct {
> struct rb_node rb (__aligned__(8)); /* 136 24 */
> long unsigned int rb_subtree_last; /* 160 8 */
> } __attribute__((__aligned__(8))) shared; /* 136 32 */
> struct vm_userfaultfd_ctx vm_userfaultfd_ctx; /* 168 0 */
This is 8 bytes in my build; I guess you have userfaultfd disabled?

Considering this will end up being 256 bytes anyway, these changes may
not matter, but we can pack this better:

1. move vm_lock to after anon_vma (ends up at 64B in the end)
2. move vm_lock_seq to after vm_lock
3. move the shared struct to the new 112 offset (which is 8B aligned)
4. move detached to the end of the structure

This should limit the holes and maintain the alignments.

The downside is that detached is now in the last used cacheline and away
from what would probably be used with it, but maybe that doesn't matter
considering prefetch.

But maybe you have other reasons for the placement?
>
> /* size: 192, cachelines: 3, members: 17 */
> /* sum members: 153, holes: 3, sum holes: 15 */
> /* padding: 24 */
> /* forced alignments: 3, forced holes: 2, sum forced holes: 12 */
> } __attribute__((__aligned__(64)));
>
> Memory consumption per 1000 VMAs becomes 48 pages, saving 2 pages compared
> to the 50 pages in the baseline:
>
> slabinfo after vm_area_struct changes:
> <name> ... <objsize> <objperslab> <pagesperslab> : ...
> vm_area_struct ... 192 42 2 : ...
>
> Performance measurements using the pft test on x86 do not show a
> considerable difference; on a Pixel 6 running Android, the change
> results in a 3-5% improvement in faults per second.
>
> [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
> [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
>
> Suren Baghdasaryan (4):
> mm: introduce vma_start_read_locked{_nested} helpers
> mm: move per-vma lock into vm_area_struct
> mm: replace rw_semaphore with atomic_t in vma_lock
> mm: move lesser used vma_area_struct members into the last cacheline
>
> include/linux/mm.h | 163 +++++++++++++++++++++++++++++++++++---
> include/linux/mm_types.h | 59 +++++++++-----
> include/linux/mmap_lock.h | 3 +
> kernel/fork.c | 50 ++----------
> mm/init-mm.c | 2 +
> mm/userfaultfd.c | 14 ++--
> 6 files changed, 205 insertions(+), 86 deletions(-)
>
>
> base-commit: 931086f2a88086319afb57cd3925607e8cda0a9f
> --
> 2.47.0.277.g8800431eea-goog
>