References: <20250109023025.2242447-1-surenb@google.com>
In-Reply-To: <20250109023025.2242447-1-surenb@google.com>
From: Suren Baghdasaryan <surenb@google.com>
Date: Wed, 8 Jan 2025 18:32:29 -0800
Subject: Re: [PATCH v8 00/16] move per-vma lock into vm_area_struct
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com,
    richard.weiyang@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel-team@android.com

On Wed, Jan
8, 2025 at 6:30 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> Back when per-vma locks were introduced, vm_lock was moved out of
> vm_area_struct in [1] because of the performance regression caused by
> false cacheline sharing. Recent investigation [2] revealed that the
> regression is limited to a rather old Broadwell microarchitecture and
> even there it can be mitigated by disabling adjacent cacheline
> prefetching, see [3].
>
> Splitting a single logical structure into multiple ones leads to more
> complicated management, extra pointer dereferences and overall less
> maintainable code. When that split-away part is a lock, it complicates
> things even further. With no performance benefits, there are no reasons
> for this split. Merging the vm_lock back into vm_area_struct also allows
> vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.
>
> This patchset:
> 1. moves vm_lock back into vm_area_struct, aligning it at the cacheline
>    boundary and changing the cache to be cacheline-aligned to minimize
>    cacheline sharing;
> 2. changes vm_area_struct initialization to mark new vma as detached until
>    it is inserted into vma tree;
> 3. replaces vm_lock and vma->detached flag with a reference counter;
> 4. changes vm_area_struct cache to SLAB_TYPESAFE_BY_RCU to allow for their
>    reuse and to minimize call_rcu() calls.
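[Editorial note: the attach/detach-via-refcount scheme described in steps 2-3 above can be sketched in userspace C11 atomics. This is an illustration of the idea only, not the patchset's code; the struct, function names, and the exact limit check are assumptions based on the cover letter (0 = detached, 1 = attached, each reader adds one).]

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct vma_sketch {
	atomic_int vm_refcnt;	/* 0 = detached, 1 = attached, 1+n = n readers */
};

/* Bounded increment-if-nonzero: refuses a zero (detached) count and
 * counts that would exceed the inclusive limit, computed so the
 * comparison itself cannot overflow an int. */
static bool refcount_add_not_zero_limited(atomic_int *r, int limit)
{
	int old = atomic_load(r);

	do {
		if (old == 0 || old > limit - 1)
			return false;
	} while (!atomic_compare_exchange_weak(r, &old, old + 1));
	return true;
}

static void vma_mark_attached(struct vma_sketch *v)
{
	atomic_store(&v->vm_refcnt, 1);	/* detached (0) -> attached (1) */
}

/* A lock-free reader succeeds only against an attached vma. */
static bool vma_start_read(struct vma_sketch *v)
{
	return refcount_add_not_zero_limited(&v->vm_refcnt, INT_MAX);
}

static void vma_end_read(struct vma_sketch *v)
{
	atomic_fetch_sub(&v->vm_refcnt, 1);
}
```

A reader that finds the count at zero backs off instead of touching a detached (or about-to-be-reused) vma, which is what lets the detached flag disappear entirely.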
>
> Pagefault microbenchmarks show performance improvement:
> Hmean     faults/cpu-1    507926.5547 (   0.00%)   506519.3692 *  -0.28%*
> Hmean     faults/cpu-4    479119.7051 (   0.00%)   481333.6802 *   0.46%*
> Hmean     faults/cpu-7    452880.2961 (   0.00%)   455845.6211 *   0.65%*
> Hmean     faults/cpu-12   347639.1021 (   0.00%)   352004.2254 *   1.26%*
> Hmean     faults/cpu-21   200061.2238 (   0.00%)   229597.0317 *  14.76%*
> Hmean     faults/cpu-30   145251.2001 (   0.00%)   164202.5067 *  13.05%*
> Hmean     faults/cpu-48   106848.4434 (   0.00%)   120641.5504 *  12.91%*
> Hmean     faults/cpu-56    92472.3835 (   0.00%)   103464.7916 *  11.89%*
> Hmean     faults/sec-1    507566.1468 (   0.00%)   506139.0811 *  -0.28%*
> Hmean     faults/sec-4   1880478.2402 (   0.00%)  1886795.6329 *   0.34%*
> Hmean     faults/sec-7   3106394.3438 (   0.00%)  3140550.7485 *   1.10%*
> Hmean     faults/sec-12  4061358.4795 (   0.00%)  4112477.0206 *   1.26%*
> Hmean     faults/sec-21  3988619.1169 (   0.00%)  4577747.1436 *  14.77%*
> Hmean     faults/sec-30  3909839.5449 (   0.00%)  4311052.2787 *  10.26%*
> Hmean     faults/sec-48  4761108.4691 (   0.00%)  5283790.5026 *  10.98%*
> Hmean     faults/sec-56  4885561.4590 (   0.00%)  5415839.4045 *  10.85%*
>
> Changes since v7 [4]:
> - Removed additional parameter for vma_iter_store() and introduced
>   vma_iter_store_attached() instead, per Vlastimil Babka and
>   Liam R. Howlett
> - Fixed coding style nits, per Vlastimil Babka
> - Added Reviewed-bys and Acked-bys, per Vlastimil Babka
> - Added Reviewed-bys and Acked-bys, per Liam R. Howlett
> - Added Acked-by, per Davidlohr Bueso
> - Removed unnecessary patch changing nommu.c
> - Folded a fixup patch [5] into the patch it was fixing
> - Changed calculation in __refcount_add_not_zero_limited() to avoid
>   overflow, to change the limit to be inclusive and to use INT_MAX to
>   indicate no limits, per Vlastimil Babka and Matthew Wilcox
> - Folded a fixup patch [6] into the patch it was fixing
> - Added vm_refcnt rules summary in the changelog, per Liam R.
>   Howlett
> - Changed writers to not increment vm_refcnt and adjusted VMA_REF_LIMIT
>   to not reserve one count for a writer, per Liam R. Howlett
> - Changed vma_refcount_put() to wake up writers only when the last reader
>   is leaving, per Liam R. Howlett
> - Fixed rwsem_acquire_read() parameters when read-locking a vma to match
>   the way down_read_trylock() does lockdep, per Vlastimil Babka
> - Folded vma_lockdep_init() into vma_lock_init() for simplicity
> - Brought back vma_copy() to keep vm_refcnt at 0 during reuse,
>   per Vlastimil Babka
>
> What I did not include in this patchset:
> - Liam's suggestion to change dump_vma() output, since it's unclear to me
>   what it should look like. The patch is for debug only and not critical
>   for the rest of the series; we can change the output later or even drop
>   it if necessary.
>
> [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
> [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
> [4] https://lore.kernel.org/all/20241226170710.1159679-1-surenb@google.com/
> [5] https://lore.kernel.org/all/20250107030415.721474-1-surenb@google.com/
> [6] https://lore.kernel.org/all/20241226200335.1250078-1-surenb@google.com/
>
> Patchset applies over mm-unstable after reverting v7
> (current SHA range: 588f0086398e - fb2270654630)

^^^ Please note that to apply this patchset over mm-unstable you
should revert the previous version. Thanks!
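[Editorial note: the "wake up writers only when the last reader is leaving" change in the v8 list above can be sketched as follows. This is an illustrative userspace model, not the kernel code: the struct, the wakeup counter standing in for rcuwait_wake_up(), and the field names are assumptions, but the last-reader condition mirrors the changelog.]

```c
#include <assert.h>
#include <stdatomic.h>

struct vma_wq_sketch {
	atomic_int vm_refcnt;		/* 1 = attached, +1 per active reader */
	atomic_int writer_wakeups;	/* stand-in for a real writer wakeup */
};

static void vma_refcount_put(struct vma_wq_sketch *v)
{
	/* atomic_fetch_sub() returns the pre-decrement value: 2 means this
	 * caller was the last reader, leaving only the "attached" reference.
	 * Only then is a writer waiting for exclusive access woken, so
	 * intermediate readers dropping out cause no spurious wakeups. */
	if (atomic_fetch_sub(&v->vm_refcnt, 1) == 2)
		atomic_fetch_add(&v->writer_wakeups, 1);
}
```

Since writers no longer bump vm_refcnt themselves (another v8 change), a writer simply waits until the count returns to 1, and the condition above fires exactly once per drain.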
>
> Suren Baghdasaryan (16):
>   mm: introduce vma_start_read_locked{_nested} helpers
>   mm: move per-vma lock into vm_area_struct
>   mm: mark vma as detached until it's added into vma tree
>   mm: introduce vma_iter_store_attached() to use with attached vmas
>   mm: mark vmas detached upon exit
>   types: move struct rcuwait into types.h
>   mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail
>   mm: move mmap_init_lock() out of the header file
>   mm: uninline the main body of vma_start_write()
>   refcount: introduce __refcount_{add|inc}_not_zero_limited
>   mm: replace vm_lock and detached flag with a reference count
>   mm/debug: print vm_refcnt state when dumping the vma
>   mm: remove extra vma_numab_state_init() call
>   mm: prepare lock_vma_under_rcu() for vma reuse possibility
>   mm: make vma cache SLAB_TYPESAFE_BY_RCU
>   docs/mm: document latest changes to vm_lock
>
>  Documentation/mm/process_addrs.rst |  44 +++++----
>  include/linux/mm.h                 | 152 ++++++++++++++++++++++-------
>  include/linux/mm_types.h           |  36 ++++---
>  include/linux/mmap_lock.h          |   6 --
>  include/linux/rcuwait.h            |  13 +--
>  include/linux/refcount.h           |  20 +++-
>  include/linux/slab.h               |   6 --
>  include/linux/types.h              |  12 +++
>  kernel/fork.c                      | 128 +++++++++++------------
>  mm/debug.c                         |  12 +++
>  mm/init-mm.c                       |   1 +
>  mm/memory.c                        |  94 +++++++++++++++---
>  mm/mmap.c                          |   3 +-
>  mm/userfaultfd.c                   |  32 +++---
>  mm/vma.c                           |  23 ++---
>  mm/vma.h                           |  15 ++-
>  tools/testing/vma/linux/atomic.h   |   5 +
>  tools/testing/vma/vma_internal.h   |  93 ++++++++----------
>  18 files changed, 435 insertions(+), 260 deletions(-)
>
> --
> 2.47.1.613.gc27f4b7a9f-goog
>
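[Editorial note: the last two mm patches in the series ("prepare lock_vma_under_rcu() for vma reuse possibility" and "make vma cache SLAB_TYPESAFE_BY_RCU") rely on a lookup pattern that re-validates the object after taking a reference, because a type-safe-by-RCU slab may recycle the memory into a different vma between the RCU load and the refcount bump. A hedged userspace model of that pattern, with a single pointer slot standing in for the vma tree and all names illustrative:]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct vma_reuse_sketch {
	atomic_int refcnt;	/* 0 = detached (slab may recycle), >= 1 = attached */
};

/* Increment only if currently nonzero; a recycled-and-still-zero
 * object is never resurrected by a reader. */
static bool sketch_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	do {
		if (old == 0)
			return false;
	} while (!atomic_compare_exchange_weak(r, &old, old + 1));
	return true;
}

static struct vma_reuse_sketch *lookup(struct vma_reuse_sketch **slot)
{
	struct vma_reuse_sketch *v = *slot;	/* rcu_dereference() in-kernel */

	if (!v || !sketch_inc_not_zero(&v->refcnt))
		return NULL;
	if (*slot != v) {			/* raced with free + reuse: bail out */
		atomic_fetch_sub(&v->refcnt, 1);
		return NULL;
	}
	return v;
}
```

The second check is the crux: once the reference is held the object can no longer be freed, so comparing the slot again proves the reference was taken on the object the tree still publishes, not on a reused one.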