From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 931C8EB64D7 for ; Fri, 16 Jun 2023 20:37:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AF8188E0003; Fri, 16 Jun 2023 16:37:50 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AA85B8E0001; Fri, 16 Jun 2023 16:37:50 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 970278E0003; Fri, 16 Jun 2023 16:37:50 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 8411C8E0001 for ; Fri, 16 Jun 2023 16:37:50 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 4AE06140655 for ; Fri, 16 Jun 2023 20:37:50 +0000 (UTC) X-FDA: 80909772300.07.F20B42D Received: from smtpout.efficios.com (smtpout.efficios.com [167.114.26.122]) by imf09.hostedemail.com (Postfix) with ESMTP id 77249140019 for ; Fri, 16 Jun 2023 20:37:48 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=efficios.com header.s=smtpout1 header.b=DVxOyhIR; spf=pass (imf09.hostedemail.com: domain of mathieu.desnoyers@efficios.com designates 167.114.26.122 as permitted sender) smtp.mailfrom=mathieu.desnoyers@efficios.com; dmarc=pass (policy=none) header.from=efficios.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1686947868; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CZtYKUYFQrkQDBQ9u7ew1tV5fcVnXrA+zSqGXZsih2w=; b=dPCInsIeYj/h3Fv6FQRJNaet65wstpkqLZ4mDzXzSYGiNDw0SgvcpEOvs4x/6zuFHNAyHv NNaz3RQoY6JCnhEKB8i8y1geU/ca0DYUT6GOxJzSyQY627oAssOMK3Nm+6jRzf97IHfups DKoex2SvgniislyjS5B3yof80sv69E4= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1686947868; a=rsa-sha256; cv=none; b=01ne5cCKOGGQC8Cz93ZnnGtIzTI9sWtFk/qzxbxBgCUwnQqwkkr3tx2Bsixs5eBKgxqCUL TI0wRR76gWCBVXFXaUfFLCletZBRsVENwL/oO4KanJWt5BeJcn+x8ML0TZ4vZbBmarA357 jYWS0RUM8UrC1jSfYmCw3MZh0uon0n0= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=efficios.com header.s=smtpout1 header.b=DVxOyhIR; spf=pass (imf09.hostedemail.com: domain of mathieu.desnoyers@efficios.com designates 167.114.26.122 as permitted sender) smtp.mailfrom=mathieu.desnoyers@efficios.com; dmarc=pass (policy=none) header.from=efficios.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=efficios.com; s=smtpout1; t=1686947867; bh=lV56SIFY+NaaZGQLbwvxwxZ65X0Qrs1nSQfoFKDC76k=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=DVxOyhIRHEXTGgYAqb2Nk1nIFJglMpSEFfiWCETrphaVMDdWpdr/quCnlBRuE0Emz /OKC17eXzDZZIRrNkOYGPeTzUeCW/D1HnkSU8M0s1PA4wrtEYe1MC0WoRIP9Hw26Eg e8jDr5DLMxddVPRUeIdrkwoa4cO2i+P1BuQW9Ckf1HrHTV06+2ssznjSzUzmx1Lti8 Dd5qNB/xYF7oXC9wqxtNQRFc+G6TrRWjO1mNZigEmGYLk9doXR604HqKTcN9VJphx4 fPCsujidzidOmd8ULo121p6eZMtHYMyA5qUFVT87M1iBlKFNOVQzoKqC9U94T8kxQt ZX96zZfQZTYvg== Received: from [172.16.0.134] (192-222-143-198.qc.cable.ebox.net [192.222.143.198]) by smtpout.efficios.com (Postfix) with ESMTPSA id 4QjWF70w6dz18YK; Fri, 16 Jun 2023 16:37:47 -0400 (EDT) Message-ID: Date: Fri, 16 Jun 2023 16:38:05 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: Re: [PATCH] mm: Move mm_count into its own cache line Content-Language: en-US To: Andrew Morton Cc: Peter Zijlstra , linux-kernel@vger.kernel.org, kernel test robot , Aaron Lu , Olivier Dion , michael.christie@oracle.com, Feng Tang , John Hubbard , Jason Gunthorpe , Peter Xu , linux-mm@kvack.org References: <20230515143536.114960-1-mathieu.desnoyers@efficios.com> <20230616131639.992998157fe696eb0e0589aa@linux-foundation.org> From: Mathieu Desnoyers In-Reply-To: <20230616131639.992998157fe696eb0e0589aa@linux-foundation.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 77249140019 X-Rspam-User: X-Stat-Signature: krnrty9hjp943a7ywhbhp6chrag958mh X-Rspamd-Server: rspam03 X-HE-Tag: 1686947868-281571 X-HE-Meta: U2FsdGVkX1+572pjstB5rZgUKYhKQpHu2yyuqoCY5uStwueVxRsIG2XNcD5y94tc3Yub8fJRQyhIx7WhP2lmRTRxsg6IbBfI8nJq1B/PjnnVUKTXDFyd1N8q19dhp2WGsthid6TicG3PlP9XqrvkJhXZ9o/SaJ5VAkncmIl9To+rC3RSiZ/1XoGWV7w2IrPWymZ0XtZek31z38OSlZPu2MZXyq3mb2IukCBnyl7ByObD14F90hdGN+7IuCXEYyD8BbK20NRMMlO90dfiFq8ivcIK7nvS0UnQv0Q9qIPG1+FdC3nooii8BuDYmCLoKUy8i7xuKuuf4n1RSYQ39xNWy6446YxBcmaS1DOiZIrjckqalj8Qo8ZeShBoqbo8bidKhWIJ37trE6/2aL7BPiRiPK1/4IzVSNT1YnjeKJxz9/4cfxAAPX+SwPJdYbwpu1ozwzjAgj4ayNoeTeC2m9zAVa1gFB9YEs7AnzuqqAUg7ZJi6JK3Py6RWFoaXG62gbxZTy0Dz38Beh8IdYgK0UtQtvEEj7XFPtEaedxgQsR9ZDC4IOOiOeg+nPCCZ5EsRuU27dHTyCcBBiClE04nzCv0P122JBbd0qfL/rj3Vz7n1BQHDEWDvCGzmxC+OI+EnpMKcNyDTrALzK/UwOSEox7ONGwuBu/rUG6Ii9qryykFMgclULzKBwXznh+eZOya5Bf5X1G7qbB65q/JKJWivAWdtGAfHS4WrS2JlNr+toqT2lXUv4MHrTy+FVluFI9JA6NaYtlN5JSZXlRdYai4RLGCZPRxaWezWRdvsV/8rnkkuHP9G9e2qIxVA5EGXrj4bRpEVrjlBqcwrK78BN1hlO0El6A2y1wYRvZmiQGKOmS2YbTrckzXxnnjXu/Mp6AfkYZ0wVkDzpPg78F0FFS3+AFLV0ccn79haac5b1tfMyt1l2+Uc670NQszYWi8I82rN/wU5LBLPfiPK0svR45PDYK 8P3YwMWS m9C6Ig507geTduHf4BQSEgku9duXHG85QZK9f13ovXfkFZI+DVrwl9vKozkMh0H2WofFTGXV5ZVRR7M1YXfpjA83bN6vwcWwBdYrkFxsoQLJYvTDvbmLxBrrI4Dc547ZoVa0APJe83jMGLSsbiDIRJ4H/LJ7iHDGg52Ji43phQqBicC+ToJExmLAFV1pE35VWrgK/uYoAP5P7ZdkWseMxK4E2wExesfIfrQK1G1fXgm67Uq9Hf0VaxX2RRmK/NyLAkSYwewwWMcjJE1qTTk0ocHadNhsisxjoJoMZ8T6VI5UlX5pqSxBeQef2zg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 6/16/23 16:16, Andrew Morton wrote: > On Mon, 15 May 2023 10:35:36 -0400 Mathieu Desnoyers wrote: > >> The mm_struct mm_count field is frequently updated by mmgrab/mmdrop >> performed by context switch. This causes false-sharing for surrounding >> mm_struct fields which are read-mostly. >> >> This has been observed on a 2sockets/112core/224cpu Intel Sapphire >> Rapids server running hackbench, and by the kernel test robot >> will-it-scale testcase. >> >> Move the mm_count field into its own cache line to prevent false-sharing >> with other mm_struct fields. >> >> Move mm_count to the first field of mm_struct to minimize the amount of >> padding required: rather than adding padding before and after the >> mm_count field, padding is only added after mm_count. >> >> Note that I noticed this odd comment in mm_struct: >> >> commit 2e3025434a6b ("mm: relocate 'write_protect_seq' in struct mm_struct") >> >> /* >> * With some kernel config, the current mmap_lock's offset >> * inside 'mm_struct' is at 0x120, which is very optimal, as >> * its two hot fields 'count' and 'owner' sit in 2 different >> * cachelines, and when mmap_lock is highly contended, both >> * of the 2 fields will be accessed frequently, current layout >> * will help to reduce cache bouncing. >> * >> * So please be careful with adding new fields before >> * mmap_lock, which can easily push the 2 fields into one >> * cacheline. >> */ >> struct rw_semaphore mmap_lock; >> >> This comment is rather odd for a few reasons: >> >> - It requires addition/removal of mm_struct fields to carefully consider >> field alignment of _other_ fields, >> - It expresses the wish to keep an "optimal" alignment for a specific >> kernel config. >> >> I suspect that the author of this comment may want to revisit this topic >> and perhaps introduce a split-struct approach for struct rw_semaphore, >> if the need is to place various fields of this structure in different >> cache lines. >> >> ... >> >> --- a/include/linux/mm_types.h >> +++ b/include/linux/mm_types.h >> @@ -583,6 +583,21 @@ struct mm_cid { >> struct kioctx_table; >> struct mm_struct { >> struct { >> + /* >> + * Fields which are often written to are placed in a separate >> + * cache line. >> + */ >> + struct { >> + /** >> + * @mm_count: The number of references to &struct >> + * mm_struct (@mm_users count as 1). >> + * >> + * Use mmgrab()/mmdrop() to modify. When this drops to >> + * 0, the &struct mm_struct is freed. >> + */ >> + atomic_t mm_count; >> + } ____cacheline_aligned_in_smp; >> + > > Why add the anonymous struct? > > atomic_t mm_count ____cacheline_aligned_in_smp; > > would suffice? The anonymous struct is needed to ensure the mm_count is alone in its own cacheline. An aligned attribute applied to an integer field only aligns the offset of that field in the structure, without changing its size. An aligned attribute applied to a structure aligns both its offset in the structure containing it _and_ extends its size with padding. This takes care of adding padding _after_ mm_count as well. Alignment on a structure requires that the structure size is extended with padding to cover the required alignment. Otherwise an array of that structure could not have each of its items on a multiple of the required alignment. > > Secondly, the ____cacheline_aligned_in_smp doesn't actually do > anything? mm_count is at offset 0 which is cacheline aligned anyway. > The next field (mm_mt) will share a cacheline with mm_count. If applied on the integer field, I agree that it would not do anything. However, applied on the anonymous structure, it takes care of adding padding _after_ the mm_count field, which is exactly what we want here. > > If the plan is to put mm_count in "its own" cacheline then padding will > be needed? It's taken care of by the anonymous structure trick. Here is an quick example showing the difference between alignment attribute applied to an integer type vs to an anonymous structure: #include struct s1 { int a __attribute__((aligned(32))); int b; }; struct s2 { int a; int b __attribute__((aligned(32))); }; struct s3 { struct { int a; } __attribute__((aligned(32))); int b; }; int main() { struct s1 t1; struct s2 t2; struct s3 t3; printf("%d %d\n", (unsigned long)&t1.a - (unsigned long)&t1, (unsigned long)&t1.b - (unsigned long)&t1); printf("%d %d\n", (unsigned long)&t2.a - (unsigned long)&t2, (unsigned long)&t2.b - (unsigned long)&t2); printf("%d %d\n", (unsigned long)&t3.a - (unsigned long)&t3, (unsigned long)&t3.b - (unsigned long)&t3); return 0; } Result: 0 4 0 32 0 32 Applying the aligned attribute on the integer field would require explicit padding, because it does not add padding after the field. Applying the aligned attribute on the _following_ field would work, but it's rather odd and error-prone. Applying the aligned attribute on an anonymous structure clearly documents the intent, and adds the padding that guarantees this variable is alone in its cache line. Thanks, Mathieu -- Mathieu Desnoyers EfficiOS Inc. https://www.efficios.com