From: Suren Baghdasaryan <surenb@google.com>
Date: Wed, 20 Nov 2024 16:33:37 -0800
Subject: Re: [PATCH v4 2/5] mm: move per-vma lock into vm_area_struct
To: Shakeel Butt
Cc: akpm@linux-foundation.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, minchan@google.com, jannh@google.com,
    souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20241120000826.335387-1-surenb@google.com>
    <20241120000826.335387-3-surenb@google.com>

On Wed, Nov 20, 2024 at 4:05 PM Shakeel Butt wrote:
>
> On Wed, Nov 20, 2024 at 03:44:29PM -0800, Suren Baghdasaryan wrote:
> > On Wed, Nov 20, 2024 at 3:33 PM Shakeel Butt wrote:
> > >
> > > On Tue, Nov 19, 2024 at 04:08:23PM -0800, Suren Baghdasaryan wrote:
> > > > Back when per-vma locks were introduced, vm_lock was moved out of
> > > > vm_area_struct in [1] because of the performance regression caused by
> > > > false cacheline sharing. Recent investigation [2] revealed that the
> > > > regression is limited to a rather old Broadwell microarchitecture and
> > > > even there it can be mitigated by disabling adjacent cacheline
> > > > prefetching, see [3].
> > > > Splitting a single logical structure into multiple ones leads to more
> > > > complicated management, extra pointer dereferences and overall less
> > > > maintainable code. When that split-away part is a lock, it complicates
> > > > things even further. With no performance benefits, there are no reasons
> > > > for this split. Merging the vm_lock back into vm_area_struct also allows
> > > > vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.
> > > > Move vm_lock back into vm_area_struct, aligning it at the cacheline
> > > > boundary and changing the cache to be cacheline-aligned as well.
> > > > With a kernel compiled using defconfig, this causes VMA memory
> > > > consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes
> > > > to 256 bytes:
> > > >
> > > > slabinfo before:
> > > >  <name>          ... <objsize> <objperslab> <pagesperslab> : ...
> > > >  vma_lock        ...        40          102             1 : ...
> > > >  vm_area_struct  ...       160           51             2 : ...
> > > >
> > > > slabinfo after moving vm_lock:
> > > >  <name>          ... <objsize> <objperslab> <pagesperslab> : ...
> > > >  vm_area_struct  ...       256           32             2 : ...
> > > >
> > > > Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
> > > > which is 5.5MB per 100000 VMAs. Note that the size of this structure is
> > > > dependent on the kernel configuration and typically the original size is
> > > > higher than 160 bytes.
> > > > Therefore these calculations are close to the
> > > > worst-case scenario. A more realistic vm_area_struct usage before this
> > > > change is:
> > > >
> > > >  <name>          ... <objsize> <objperslab> <pagesperslab> : ...
> > > >  vma_lock        ...        40          102             1 : ...
> > > >  vm_area_struct  ...       176           46             2 : ...
> > > >
> > > > Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages,
> > > > which is 3.9MB per 100000 VMAs.
> > > > This memory consumption growth can be addressed later by optimizing
> > > > the vm_lock.
> > > >
> > > > [1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
> > > > [2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
> > > > [3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
> > > >
> > > > Signed-off-by: Suren Baghdasaryan
> > > > Reviewed-by: Lorenzo Stoakes
> > >
> > > Reviewed-by: Shakeel Butt
> >
> > Thanks!
> >
> > >
> > > One question below.
> > >
> > > > --- a/include/linux/mm_types.h
> > > > +++ b/include/linux/mm_types.h
> > > > @@ -716,8 +716,6 @@ struct vm_area_struct {
> > > >            * slowpath.
> > > >            */
> > > >           unsigned int vm_lock_seq;
> > > > -         /* Unstable RCU readers are allowed to read this. */
> > > > -         struct vma_lock *vm_lock;
> > > >  #endif
> > > >
> > > >  /*
> > > > @@ -770,6 +768,10 @@ struct vm_area_struct {
> > > >           struct vma_numab_state *numab_state;  /* NUMA Balancing state */
> > > >  #endif
> > > >           struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
> > > > +#ifdef CONFIG_PER_VMA_LOCK
> > > > +         /* Unstable RCU readers are allowed to read this. */
> > > > +         struct vma_lock vm_lock ____cacheline_aligned_in_smp;
> > > > +#endif
> > > >  } __randomize_layout;
> > >
> > > Do we just want 'struct vm_area_struct' to be cacheline aligned or do we
> > > want 'struct vma_lock vm_lock' to be on a separate cacheline as well?
> >
> > We want both to minimize cacheline sharing.
> >
>
> For the latter, you will need to add a pad after vm_lock as well, so any
> future addition will not share the cacheline with vm_lock. I would do
> something like below. This is a nit and can be done later.
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 7654c766cbe2..5cc4fff163a0 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -751,10 +751,12 @@ struct vm_area_struct {
>  #endif
>         struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
>  #ifdef CONFIG_PER_VMA_LOCK
> +       CACHELINE_PADDING(__pad1__);
>         /* Unstable RCU readers are allowed to read this. */
> -       struct vma_lock vm_lock ____cacheline_aligned_in_smp;
> +       struct vma_lock vm_lock;
> +       CACHELINE_PADDING(__pad2__);
>  #endif
> -} __randomize_layout;
> +} __randomize_layout ____cacheline_aligned_in_smp;

I thought SLAB_HWCACHE_ALIGN for vm_area_cachep, added in this patch, would
have the same result, no?

>
> #ifdef CONFIG_NUMA
> #define vma_policy(vma) ((vma)->vm_policy)
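
For context, a minimal sketch of the cache setup being referred to here,
assuming vm_area_cachep keeps its usual KMEM_CACHE() creation in
kernel/fork.c (the exact flag set in the real patch may differ):

/*
 * Sketch only, not the literal patch hunk. With sizeof(vm_area_struct)
 * padded to a multiple of the cacheline size, SLAB_HWCACHE_ALIGN makes
 * every object start at a cacheline boundary, so the last member cannot
 * share a line with the next object in the slab.
 */
vm_area_cachep = KMEM_CACHE(vm_area_struct,
                            SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT);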
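A hypothetical compile-time check (not part of the patch) could verify the
member alignment that the discussion above relies on, using only existing
kernel helpers:

#include <linux/build_bug.h> /* static_assert */
#include <linux/cache.h>     /* SMP_CACHE_BYTES */
#include <linux/mm_types.h>  /* struct vm_area_struct */
#include <linux/stddef.h>    /* offsetof */

/*
 * Hypothetical: ____cacheline_aligned_in_smp on vm_lock should place it
 * at a multiple of SMP_CACHE_BYTES within vm_area_struct.
 */
static_assert(offsetof(struct vm_area_struct, vm_lock) % SMP_CACHE_BYTES == 0,
              "vm_lock must start on an SMP cacheline boundary");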