From mboxrd@z Thu Jan  1 00:00:00 1970
From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 11 Nov 2024 13:41:08 -0800
Subject: Re: [PATCH 0/4] move per-vma lock into vm_area_struct
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <20241111205506.3404479-1-surenb@google.com>
References: <20241111205506.3404479-1-surenb@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Mon, Nov 11, 2024 at 12:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> Back when per-vma locks were introduced, vm_lock was moved out of
> vm_area_struct in [1] because of the performance regression caused by
> false cacheline sharing. Recent investigation [2] revealed that the
> regression is limited to a rather old Broadwell microarchitecture, and
> even there it can be mitigated by disabling adjacent-cacheline
> prefetching; see [3].
>
> This patchset moves vm_lock back into vm_area_struct, aligning it at a
> cacheline boundary and changing the slab cache to be cacheline-aligned
> as well. This causes VMA memory consumption to grow from 160
> (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes:
>
> slabinfo before:
>  ...                             : ...
>  vma_lock          ...  40 102 1 : ...
>  vm_area_struct    ... 160  51 2 : ...
>
> slabinfo after moving vm_lock:
>  ...                             : ...
>  vm_area_struct    ... 256  32 2 : ...
>
> Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64
> pages, which is 5.5 MB per 100000 VMAs.
>
> To minimize memory overhead, the vm_lock implementation is changed from
> an rw_semaphore (40 bytes) to an atomic (8 bytes), and several
> vm_area_struct members are moved into the last cacheline, resulting in
> a less fragmented structure:
>
> struct vm_area_struct {
>         union {
>                 struct {
>                         long unsigned int vm_start;      /*     0     8 */
>                         long unsigned int vm_end;        /*     8     8 */
>                 };                                       /*     0    16 */
>                 struct callback_head vm_rcu;             /*     0    16 */
>         } __attribute__((__aligned__(8)));               /*     0    16 */
>         struct mm_struct *         vm_mm;                /*    16     8 */
>         pgprot_t                   vm_page_prot;         /*    24     8 */
>         union {
>                 const vm_flags_t   vm_flags;             /*    32     8 */
>                 vm_flags_t         __vm_flags;           /*    32     8 */
>         };                                               /*    32     8 */
>         bool                       detached;             /*    40     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         unsigned int               vm_lock_seq;          /*    44     4 */
>         struct list_head           anon_vma_chain;       /*    48    16 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct anon_vma *          anon_vma;             /*    64     8 */
>         const struct vm_operations_struct * vm_ops;      /*    72     8 */
>         long unsigned int          vm_pgoff;             /*    80     8 */
>         struct file *              vm_file;              /*    88     8 */
>         void *                     vm_private_data;      /*    96     8 */
>         atomic_long_t              swap_readahead_info;  /*   104     8 */
>         struct mempolicy *         vm_policy;            /*   112     8 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         struct vma_lock            vm_lock (__aligned__(64)); /*   128     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct {
>                 struct rb_node     rb (__aligned__(8));  /*   136    24 */
>                 long unsigned int  rb_subtree_last;      /*   160     8 */
>         } __attribute__((__aligned__(8))) shared;        /*   136    32 */
>         struct vm_userfaultfd_ctx  vm_userfaultfd_ctx;   /*   168     0 */
>
>         /* size: 192, cachelines: 3, members: 17 */
>         /* sum members: 153, holes: 3, sum holes: 15 */
>         /* padding: 24 */
>         /* forced alignments: 3, forced holes: 2, sum forced holes: 12 */
> } __attribute__((__aligned__(64)));
>
> Memory consumption per 1000 VMAs becomes 48 pages, saving 2 pages
> compared to the 50 pages in the baseline:
>
> slabinfo after vm_area_struct changes:
>  ...                             : ...
>  vm_area_struct    ... 192  42 2 : ...
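
[Editor's note: the page counts quoted above follow directly from the
slabinfo columns shown (object size, objects per slab, pages per slab).
Below is a small standalone C sketch of that arithmetic; it assumes
4 KiB pages and whole-slab rounding, and the struct/field names are
illustrative only, not part of the series.]

#include <stdio.h>

struct slab_cfg {
	const char *name;
	unsigned int obj_size;       /* bytes per object */
	unsigned int objs_per_slab;  /* objects held by one slab */
	unsigned int pages_per_slab; /* 4 KiB pages backing one slab */
};

int main(void)
{
	/* Columns copied from the slabinfo excerpts above. */
	static const struct slab_cfg cfgs[] = {
		{ "baseline vm_area_struct", 160,  51, 2 },
		{ "baseline vma_lock",        40, 102, 1 },
		{ "vm_lock moved in",        256,  32, 2 },
		{ "after member repacking",  192,  42, 2 },
	};

	for (unsigned int i = 0; i < sizeof(cfgs) / sizeof(cfgs[0]); i++) {
		/* Slabs needed for 1000 objects, rounded up to whole slabs. */
		unsigned int slabs = (1000 + cfgs[i].objs_per_slab - 1) /
				     cfgs[i].objs_per_slab;

		printf("%-26s %3u B/obj -> %2u pages per 1000 VMAs\n",
		       cfgs[i].name, cfgs[i].obj_size,
		       slabs * cfgs[i].pages_per_slab);
	}
	return 0;
}

[This reproduces the numbers in the cover letter: 40 + 10 = 50 pages for
the baseline, 64 pages with vm_lock moved in, and 48 pages after
repacking.]
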
>
> Performance measurements using the pft test on x86 do not show a
> considerable difference; on a Pixel 6 running Android, the change
> yields a 3-5% improvement in faults per second.
>
> [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
> [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/

And of course I forgot to update Lorenzo's new locking documentation :/
Will add that in the next version.

>
> Suren Baghdasaryan (4):
>   mm: introduce vma_start_read_locked{_nested} helpers
>   mm: move per-vma lock into vm_area_struct
>   mm: replace rw_semaphore with atomic_t in vma_lock
>   mm: move lesser used vm_area_struct members into the last cacheline
>
>  include/linux/mm.h        | 163 +++++++++++++++++++++++++++++++++++---
>  include/linux/mm_types.h  |  59 +++++++++-----
>  include/linux/mmap_lock.h |   3 +
>  kernel/fork.c             |  50 ++----------
>  mm/init-mm.c              |   2 +
>  mm/userfaultfd.c          |  14 ++--
>  6 files changed, 205 insertions(+), 86 deletions(-)
>
>
> base-commit: 931086f2a88086319afb57cd3925607e8cda0a9f
> --
> 2.47.0.277.g8800431eea-goog
>
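
[Editor's note: as background on the rw_semaphore-to-atomic conversion
in patch 3, the general technique packs a reader count and a writer flag
into a single atomic word: readers conditionally increment the count,
while the writer claims the flag and waits for readers to drain. The
sketch below shows only that general idea, not the actual patch -- every
name (struct vma_lock_sketch, VMA_LOCK_WRITER, the function names) is
hypothetical, it uses C11 atomics so it compiles standalone, and it
busy-waits where the kernel would block.]

#include <stdatomic.h>
#include <stdbool.h>

#define VMA_LOCK_WRITER (1u << 31)   /* hypothetical writer-held bit */

struct vma_lock_sketch {
	_Atomic unsigned int cnt;    /* one word vs. ~40 bytes for an rw_semaphore */
};

/* Read side is a trylock: if a writer holds or is claiming the lock,
 * the caller falls back to the mmap_lock path instead of waiting. */
static bool vma_read_trylock(struct vma_lock_sketch *l)
{
	unsigned int old = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	do {
		if (old & VMA_LOCK_WRITER)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&l->cnt, &old, old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	return true;
}

static void vma_read_unlock(struct vma_lock_sketch *l)
{
	atomic_fetch_sub_explicit(&l->cnt, 1, memory_order_release);
}

/* Writer: claim the flag once no other writer holds it, then wait for
 * the reader count to drain to zero. */
static void vma_write_lock(struct vma_lock_sketch *l)
{
	unsigned int old = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	for (;;) {
		old &= ~VMA_LOCK_WRITER; /* only succeed when no writer bit is set */
		if (atomic_compare_exchange_weak_explicit(&l->cnt, &old,
							  old | VMA_LOCK_WRITER,
							  memory_order_acquire,
							  memory_order_relaxed))
			break;
	}
	/* Busy-wait for in-flight readers; the kernel would sleep instead. */
	while (atomic_load_explicit(&l->cnt, memory_order_acquire) !=
	       VMA_LOCK_WRITER)
		;
}

static void vma_write_unlock(struct vma_lock_sketch *l)
{
	atomic_fetch_and_explicit(&l->cnt, ~VMA_LOCK_WRITER,
				  memory_order_release);
}

[Keeping the read side as a trylock is what lets the whole lock collapse
into one word: there is no reader wait queue to maintain, since a
contended reader simply takes mmap_lock instead.]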