From mboxrd@z Thu Jan 1 00:00:00 1970
From: Suren Baghdasaryan <surenb@google.com>
Date: Thu, 17 Oct 2024 12:42:43 -0700
Subject: Re: [PATCH v3 tip/perf/core 2/4] mm: switch to 64-bit mm_lock_seq/vm_lock_seq on 64-bit architectures
To: Andrii Nakryiko
Cc: Shakeel Butt, Andrii Nakryiko, linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org, oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org, paulmck@kernel.org, willy@infradead.org, akpm@linux-foundation.org, mjguzik@gmail.com, brauner@kernel.org, jannh@google.com, mhocko@kernel.org, vbabka@suse.cz, hannes@cmpxchg.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com
References: <20241010205644.3831427-1-andrii@kernel.org> <20241010205644.3831427-3-andrii@kernel.org> <55hskn2iz5ixsl6wvupnhx7hkzcvx2u4muswvzi4wuqplmu2uo@rj72ypyeksjy>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
On Thu, Oct 17, 2024 at 11:55 AM Andrii Nakryiko wrote:
>
> On Wed, Oct 16, 2024 at 7:02 PM Suren Baghdasaryan wrote:
> >
> > On Sun, Oct 13, 2024 at 12:56 AM Shakeel Butt wrote:
> > >
> > > On Thu, Oct 10, 2024 at 01:56:42PM GMT, Andrii Nakryiko wrote:
> > > > To increase
> > > > mm->mm_lock_seq robustness, switch it from int to long, so
> > > > that it's a 64-bit counter on 64-bit systems and we can stop worrying
> > > > about it wrapping around in just ~4 billion iterations. Same goes for
> > > > VMA's matching vm_lock_seq, which is derived from mm_lock_seq.
> > >
> > > vm_lock_seq does not need to be long but for consistency I guess that
>
> How come, we literally assign vm_lock_seq from mm_lock_seq and do
> direct comparisons. They have to be exactly the same type, no?

Not necessarily. vm_lock_seq is a snapshot of the mm_lock_seq but it
does not have to be a "complete" snapshot. Just something that has a
very high probability of identifying a match, where a rare false
positive is not a problem (see comment in
https://elixir.bootlin.com/linux/v6.11.3/source/include/linux/mm.h#L678).
So, something like this for taking and comparing a snapshot would do:

vma->vm_lock_seq = (unsigned int)mm->mm_lock_seq;
if (vma->vm_lock_seq == (unsigned int)mm->mm_lock_seq)

> > > makes sense. While at it, can you please change these seq counters to
> > > be unsigned?
>
> There is `vma->vm_lock_seq = -1;` in kernel/fork.c, should it be
> switched to ULONG_MAX then? In general, unless this is critical for
> correctness, I'd very much like stuff like this to be done in the mm
> tree afterwards, but it seems trivial enough, so if you insist I'll do
> it.

Yeah, ULONG_MAX should work fine here. vma->vm_lock_seq is initialized
to -1 to avoid a false initial match with mm->mm_lock_seq, which is
initialized to 0. As I said, a false match is not a problem but if we
can avoid it, that's better.

> > Also, did you check with pahole if the vm_area_struct layout change
> > pushes some members into a different cacheline or creates new gaps?
>
> Just did.
> We had a 3-byte hole after `bool detached;`, which now grew to 7
> bytes (so +4), and vm_lock_seq itself is now 8 bytes (so +4),
> which now does push rb and rb_subtree_last into *THE SAME* cache line
> (which sounds like an improvement to me). vm_lock_seq and vm_lock stay
> in the same cache line. vm_pgoff and vm_file are now in the same cache
> line, and given they are probably always accessed together, this seems
> like a good accidental change as well. See the pahole outputs below,
> before and after.

Ok, sounds good to me. Looks like keeping both sequence numbers 64-bit
is not an issue. Changing them to unsigned would be nice and trivial
but I don't insist. You can add:

Reviewed-by: Suren Baghdasaryan

> That singular detached bool looks like a complete waste, tbh. Maybe it
> would be better to roll it into vm_flags and save 8 bytes? (not that I
> want to do those mm changes in this patch set, of course...).
> vm_area_struct is otherwise nicely tightly packed.
>
> tl;dr, seems fine, and detached would be best to get rid of, if
> possible (but that's a completely separate thing)

Yeah, I'll take a look at that. Thanks!
> BEFORE
> ======
> struct vm_area_struct {
>         union {
>                 struct {
>                         long unsigned int vm_start;      /*     0     8 */
>                         long unsigned int vm_end;        /*     8     8 */
>                 };                                       /*     0    16 */
>                 struct callback_head vm_rcu;             /*     0    16 */
>         } __attribute__((__aligned__(8)));               /*     0    16 */
>         struct mm_struct *         vm_mm;                /*    16     8 */
>         pgprot_t                   vm_page_prot;         /*    24     8 */
>         union {
>                 const vm_flags_t   vm_flags;             /*    32     8 */
>                 vm_flags_t         __vm_flags;           /*    32     8 */
>         };                                               /*    32     8 */
>         bool                       detached;             /*    40     1 */
>
>         /* XXX 3 bytes hole, try to pack */
>
>         int                        vm_lock_seq;          /*    44     4 */
>         struct vma_lock *          vm_lock;              /*    48     8 */
>         struct {
>                 struct rb_node     rb;                   /*    56    24 */
>                 /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
>                 long unsigned int  rb_subtree_last;      /*    80     8 */
>         }                                                /*    56    32 */
>         struct list_head           anon_vma_chain;       /*    88    16 */
>         struct anon_vma *          anon_vma;             /*   104     8 */
>         const struct vm_operations_struct * vm_ops;      /*   112     8 */
>         long unsigned int          vm_pgoff;             /*   120     8 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         struct file *              vm_file;              /*   128     8 */
>         void *                     vm_private_data;      /*   136     8 */
>         atomic_long_t              swap_readahead_info;  /*   144     8 */
>         struct mempolicy *         vm_policy;            /*   152     8 */
>         struct vma_numab_state *   numab_state;          /*   160     8 */
>         struct vm_userfaultfd_ctx  vm_userfaultfd_ctx;   /*   168     8 */
>
>         /* size: 176, cachelines: 3, members: 18 */
>         /* sum members: 173, holes: 1, sum holes: 3 */
>         /* forced alignments: 2 */
>         /* last cacheline: 48 bytes */
> } __attribute__((__aligned__(8)));
>
> AFTER
> =====
> struct vm_area_struct {
>         union {
>                 struct {
>                         long unsigned int vm_start;      /*     0     8 */
>                         long unsigned int vm_end;        /*     8     8 */
>                 };                                       /*     0    16 */
>                 struct callback_head vm_rcu;             /*     0    16 */
>         } __attribute__((__aligned__(8)));               /*     0    16 */
>         struct mm_struct *         vm_mm;                /*    16     8 */
>         pgprot_t                   vm_page_prot;         /*    24     8 */
>         union {
>                 const vm_flags_t   vm_flags;             /*    32     8 */
>                 vm_flags_t         __vm_flags;           /*    32     8 */
>         };                                               /*    32     8 */
>         bool                       detached;             /*    40     1 */
>
>         /* XXX 7 bytes hole, try to pack */
>
>         long int                   vm_lock_seq;          /*    48     8 */
>         struct vma_lock *          vm_lock;              /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
>         struct {
>                 struct rb_node     rb;                   /*    64    24 */
>                 long unsigned int  rb_subtree_last;      /*    88     8 */
>         }                                                /*    64    32 */
>         struct list_head           anon_vma_chain;       /*    96    16 */
>         struct anon_vma *          anon_vma;             /*   112     8 */
>         const struct vm_operations_struct * vm_ops;      /*   120     8 */
>         /* --- cacheline 2 boundary (128 bytes) --- */
>         long unsigned int          vm_pgoff;             /*   128     8 */
>         struct file *              vm_file;              /*   136     8 */
>         void *                     vm_private_data;      /*   144     8 */
>         atomic_long_t              swap_readahead_info;  /*   152     8 */
>         struct mempolicy *         vm_policy;            /*   160     8 */
>         struct vma_numab_state *   numab_state;          /*   168     8 */
>         struct vm_userfaultfd_ctx  vm_userfaultfd_ctx;   /*   176     8 */
>
>         /* size: 184, cachelines: 3, members: 18 */
>         /* sum members: 177, holes: 1, sum holes: 7 */
>         /* forced alignments: 2 */
>         /* last cacheline: 56 bytes */
> } __attribute__((__aligned__(8)));
>
> > > > I didn't use __u64 outright to keep 32-bit architectures unaffected, but
> > > > if it seems important enough, I have nothing against using __u64.
> > > >
> > > > Suggested-by: Jann Horn
> > > > Signed-off-by: Andrii Nakryiko
> > >
> > > Reviewed-by: Shakeel Butt