Message-ID: <1634ca8f-2b22-712e-15f9-9980ba8a4e64@nvidia.com>
Date: Fri, 16 Jun 2023 15:35:04 -0700
Subject: Re: [PATCH] mm: Move mm_count into its own cache line
To: Mathieu Desnoyers, Andrew Morton
CC: Peter Zijlstra, kernel test robot, Aaron Lu, Olivier Dion, Feng Tang, Jason Gunthorpe, Peter Xu
References: <20230515143536.114960-1-mathieu.desnoyers@efficios.com> <20230616131639.992998157fe696eb0e0589aa@linux-foundation.org>
From: John Hubbard

On 6/16/23 13:38, Mathieu Desnoyers wrote:
...
>>> This comment is rather odd for a few reasons:
>>>
>>> - It requires addition/removal of mm_struct fields to carefully consider
>>>    field alignment of _other_ fields,
>>> - It expresses the wish to keep an "optimal" alignment for a specific
>>>    kernel config.
>>>
>>> I suspect that the author of this comment may want to revisit this topic
>>> and perhaps introduce a split-struct approach for struct rw_semaphore,
>>> if the need is to place various fields of this structure in different
>>> cache lines.
>>>

Agreed. The whole thing is far too fragile, but when reviewing this I
wasn't sure what else to suggest. Now, looking at it again with your
alignment suggestion, there is an interesting set of conflicting desires:

a) Here: Feng Tang discovered that .count and .owner are best put in
separate cache lines for the contended case for mmap_lock, and

b) rwsem.h, which specifies precisely the opposite for the uncontended
case:

 * For an uncontended rwsem, count and owner are the only fields a task
 * needs to touch when acquiring the rwsem. So they are put next to each
 * other to increase the chance that they will share the same cacheline.

I suspect that overall, it's "better" to align rw_semaphore's .count and
.owner fields so that the lock is optimized for the contended case. It's
reasonable to claim that the benefit of having those two fields in the
same cacheline for the uncontended case is far smaller than the cost, in
the contended case, of keeping them close to each other.

However, it's still quite possible that someone will measure a
performance regression if such a change is made.

Thoughts?

...
>> If the plan is to put mm_count in "its own" cacheline then padding will
>> be needed?
>
> It's taken care of by the anonymous structure trick. Here is a quick
> example showing the difference between the alignment attribute applied to
> an integer type vs to an anonymous structure:

Thanks for explaining very clearly how that works, that's really helpful!
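
For anyone reading without the quoted example at hand, here is a minimal
sketch of the difference being discussed. It is not the example from the
thread: the struct names, the LINE macro and the 64-byte line size are
assumptions made for illustration, and it relies on the GCC/Clang
"aligned" attribute, C11 anonymous structs and a 4-byte int.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size, for illustration only. */
#define LINE 64

/*
 * Attribute on the integer member: only the *start* of b is aligned.
 * Nothing pads after b, so c lands in the same line as b.
 */
struct int_aligned {
	int a;
	int b __attribute__((aligned(LINE)));
	int c;
};

/*
 * Attribute on an anonymous struct type: the start is aligned *and* the
 * struct's size is rounded up to LINE, so c is pushed to the next line.
 */
struct anon_aligned {
	int a;
	struct {
		int b;
	} __attribute__((aligned(LINE)));
	int c;
};

/* Helper: do two byte offsets fall in the same LINE-sized block? */
static int same_line(size_t x, size_t y)
{
	return x / LINE == y / LINE;
}

/* Compile-time checks of the layouts described above. */
static_assert(offsetof(struct int_aligned, b) == LINE, "b starts a line");
static_assert(offsetof(struct int_aligned, c) == LINE + sizeof(int),
	      "c directly follows b");
static_assert(offsetof(struct anon_aligned, b) == LINE, "b starts a line");
static_assert(offsetof(struct anon_aligned, c) == 2 * LINE,
	      "c starts the following line");
```

Under those assumptions, c sits at offset 68 in the first layout and at
offset 128 in the second, so only the anonymous-struct form leaves b
alone in its line without hand-written padding.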

thanks,
--
John Hubbard
NVIDIA
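
The (a)/(b) conflict over .count and .owner can be made concrete with a
toy layout. This is only a sketch: toy_rwsem_packed and toy_rwsem_split
are hypothetical stand-ins, not the kernel's struct rw_semaphore, and the
64-byte line size and GCC/Clang "aligned" attribute are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size, for illustration only. */
#define LINE 64

/*
 * (b) Uncontended-friendly: count and owner are adjacent, so one line
 * fetch covers both. The struct itself is aligned here so that "same
 * line" is guaranteed rather than merely likely.
 */
struct toy_rwsem_packed {
	long count;
	long owner;
} __attribute__((aligned(LINE)));

/*
 * (a) Contended-friendly: each field sits in its own aligned anonymous
 * struct, so writers of count do not bounce the line holding owner.
 */
struct toy_rwsem_split {
	struct {
		long count;
	} __attribute__((aligned(LINE)));
	struct {
		long owner;
	} __attribute__((aligned(LINE)));
};

/* Helper: do two byte offsets fall in the same LINE-sized block? */
static int fields_share_line(size_t x, size_t y)
{
	return x / LINE == y / LINE;
}
```

The split form doubles the footprint of the toy lock (two lines instead
of one), which is part of the cost that would have to be weighed against
the contended-case gain.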