From mboxrd@z Thu Jan  1 00:00:00 1970
From: Suren Baghdasaryan <surenb@google.com>
Date: Mon, 11 Nov 2024 13:41:08 -0800
Subject: Re: [PATCH 0/4] move per-vma lock into vm_area_struct
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <20241111205506.3404479-1-surenb@google.com>
References: <20241111205506.3404479-1-surenb@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Mon, Nov 11, 2024 at 12:55 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> Back when per-vma locks were introduced, vm_lock was moved out of
> vm_area_struct in [1] because of the performance regression caused by
> false cacheline sharing. Recent investigation [2] revealed that the
> regression is limited to a rather old Broadwell microarchitecture, and
> even there it can be mitigated by disabling adjacent-cacheline
> prefetching; see [3].
>
> This patchset moves vm_lock back into vm_area_struct, aligning it at a
> cacheline boundary and changing the slab cache to be cacheline-aligned
> as well. This causes VMA memory consumption to grow from 160
> (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes:
>
> slabinfo before:
>  ...                             : ...
>  vma_lock          ...  40 102 1 : ...
>  vm_area_struct    ... 160  51 2 : ...
>
> slabinfo after moving vm_lock:
>  ...                             : ...
>  vm_area_struct    ... 256  32 2 : ...
>
> Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64
> pages, which is 5.5 MB per 100000 VMAs.
>
> To minimize memory overhead, the vm_lock implementation is changed from
> an rw_semaphore (40 bytes) to an atomic (8 bytes), and several
> vm_area_struct members are moved into the last cacheline, resulting in
> a less fragmented structure:
>
> struct vm_area_struct {
>         union {
>                 struct {
>                         long unsigned int vm_start;      /*     0     8 */
>                         long unsigned int vm_end;        /*     8     8 */
>                 };                                       /*     0    16 */
>                 struct callback_head vm_rcu;             /*     0    16 */
>         } __attribute__((__aligned__(8)));               /*     0    16 */
>         struct mm_struct *         vm_mm;                /*    16     8 */
>         pgprot_t                   vm_page_prot;         /*    24     8 */
>         union {
>                 const vm_flags_t   vm_flags;             /*    32     8 */
>                 vm_flags_t         __vm_flags;           /*    32     8 */
>         };                                               /*    32     8 */
>         bool                       detached;             /*    40     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         unsigned int               vm_lock_seq;          /*    44     4 */
>         struct list_head           anon_vma_chain;       /*    48    16 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct anon_vma *          anon_vma;             /*    64     8 */
>         const struct vm_operations_struct * vm_ops;      /*    72     8 */
>         long unsigned int          vm_pgoff;             /*    80     8 */
>         struct file *              vm_file;              /*    88     8 */
>         void *                     vm_private_data;      /*    96     8 */
>         atomic_long_t              swap_readahead_info;  /*   104     8 */
>         struct mempolicy *         vm_policy;            /*   112     8 */
>
>         /* XXX 8 bytes hole, try to pack */
>
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         struct vma_lock            vm_lock (__aligned__(64)); /*   128     4 */
>
>         /* XXX 4 bytes hole, try to pack */
>
>         struct {
>                 struct rb_node     rb (__aligned__(8));  /*   136    24 */
>                 long unsigned int  rb_subtree_last;      /*   160     8 */
>         } __attribute__((__aligned__(8))) shared;        /*   136    32 */
>         struct vm_userfaultfd_ctx  vm_userfaultfd_ctx;   /*   168     0 */
>
>         /* size: 192, cachelines: 3, members: 17 */
>         /* sum members: 153, holes: 3, sum holes: 15 */
>         /* padding: 24 */
>         /* forced alignments: 3, forced holes: 2, sum forced holes: 12 */
> } __attribute__((__aligned__(64)));
>
> Memory consumption per 1000 VMAs becomes 48 pages, saving 2 pages
> compared to the 50 pages in the baseline:
>
> slabinfo after vm_area_struct changes:
>  ...                             : ...
>  vm_area_struct    ... 192  42 2 : ...
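
[Editor's note: the page counts quoted above follow directly from the
slabinfo columns shown (object size, objects per slab, pages per slab).
Below is a small standalone C sketch of that arithmetic; it assumes
4 KiB pages and whole-slab rounding, and the struct/field names are
illustrative only, not part of the series.]

#include <stdio.h>

struct slab_cfg {
	const char *name;
	unsigned int obj_size;       /* bytes per object */
	unsigned int objs_per_slab;  /* objects held by one slab */
	unsigned int pages_per_slab; /* 4 KiB pages backing one slab */
};

int main(void)
{
	/* Columns copied from the slabinfo excerpts above. */
	static const struct slab_cfg cfgs[] = {
		{ "baseline vm_area_struct", 160,  51, 2 },
		{ "baseline vma_lock",        40, 102, 1 },
		{ "vm_lock moved in",        256,  32, 2 },
		{ "after member repacking",  192,  42, 2 },
	};

	for (unsigned int i = 0; i < sizeof(cfgs) / sizeof(cfgs[0]); i++) {
		/* Slabs needed for 1000 objects, rounded up to whole slabs. */
		unsigned int slabs = (1000 + cfgs[i].objs_per_slab - 1) /
				     cfgs[i].objs_per_slab;

		printf("%-26s %3u B/obj -> %2u pages per 1000 VMAs\n",
		       cfgs[i].name, cfgs[i].obj_size,
		       slabs * cfgs[i].pages_per_slab);
	}
	return 0;
}

[This reproduces the numbers in the cover letter: 40 + 10 = 50 pages for
the baseline, 64 pages with vm_lock moved in, and 48 pages after
repacking.]
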
>
> Performance measurements using the pft test on x86 do not show a
> considerable difference; on a Pixel 6 running Android, the change
> yields a 3-5% improvement in faults per second.
>
> [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
> [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/

And of course I forgot to update Lorenzo's new locking documentation :/
Will add that in the next version.

>
> Suren Baghdasaryan (4):
>   mm: introduce vma_start_read_locked{_nested} helpers
>   mm: move per-vma lock into vm_area_struct
>   mm: replace rw_semaphore with atomic_t in vma_lock
>   mm: move lesser used vm_area_struct members into the last cacheline
>
>  include/linux/mm.h        | 163 +++++++++++++++++++++++++++++++++++---
>  include/linux/mm_types.h  |  59 +++++++++-----
>  include/linux/mmap_lock.h |   3 +
>  kernel/fork.c             |  50 ++----------
>  mm/init-mm.c              |   2 +
>  mm/userfaultfd.c          |  14 ++--
>  6 files changed, 205 insertions(+), 86 deletions(-)
>
>
> base-commit: 931086f2a88086319afb57cd3925607e8cda0a9f
> --
> 2.47.0.277.g8800431eea-goog
>
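
[Editor's note: as background on the rw_semaphore-to-atomic conversion
in patch 3, the general technique packs a reader count and a writer flag
into a single atomic word: readers conditionally increment the count,
while the writer claims the flag and waits for readers to drain. The
sketch below shows only that general idea, not the actual patch -- every
name (struct vma_lock_sketch, VMA_LOCK_WRITER, the function names) is
hypothetical, it uses C11 atomics so it compiles standalone, and it
busy-waits where the kernel would block.]

#include <stdatomic.h>
#include <stdbool.h>

#define VMA_LOCK_WRITER (1u << 31)   /* hypothetical writer-held bit */

struct vma_lock_sketch {
	_Atomic unsigned int cnt;    /* one word vs. ~40 bytes for an rw_semaphore */
};

/* Read side is a trylock: if a writer holds or is claiming the lock,
 * the caller falls back to the mmap_lock path instead of waiting. */
static bool vma_read_trylock(struct vma_lock_sketch *l)
{
	unsigned int old = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	do {
		if (old & VMA_LOCK_WRITER)
			return false;
	} while (!atomic_compare_exchange_weak_explicit(&l->cnt, &old, old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	return true;
}

static void vma_read_unlock(struct vma_lock_sketch *l)
{
	atomic_fetch_sub_explicit(&l->cnt, 1, memory_order_release);
}

/* Writer: claim the flag once no other writer holds it, then wait for
 * the reader count to drain to zero. */
static void vma_write_lock(struct vma_lock_sketch *l)
{
	unsigned int old = atomic_load_explicit(&l->cnt, memory_order_relaxed);

	for (;;) {
		old &= ~VMA_LOCK_WRITER; /* only succeed when no writer bit is set */
		if (atomic_compare_exchange_weak_explicit(&l->cnt, &old,
							  old | VMA_LOCK_WRITER,
							  memory_order_acquire,
							  memory_order_relaxed))
			break;
	}
	/* Busy-wait for in-flight readers; the kernel would sleep instead. */
	while (atomic_load_explicit(&l->cnt, memory_order_acquire) !=
	       VMA_LOCK_WRITER)
		;
}

static void vma_write_unlock(struct vma_lock_sketch *l)
{
	atomic_fetch_and_explicit(&l->cnt, ~VMA_LOCK_WRITER,
				  memory_order_release);
}

[Keeping the read side as a trylock is what lets the whole lock collapse
into one word: there is no reader wait queue to maintain, since a
contended reader simply takes mmap_lock instead.]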