Date: Thu, 29 Jun 2023 18:54:27 -0700
Subject: Re: [PATCH v1 11/14] arm64/mm: Wire up PTE_CONT for user mappings
From: John Hubbard
To: Ryan Roberts, Catalin Marinas, Will Deacon, Ard Biesheuvel, Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, Anshuman Khandual, Matthew Wilcox, Yu Zhao, Mark Rutland
References: <20230622144210.2623299-1-ryan.roberts@arm.com> <20230622144210.2623299-12-ryan.roberts@arm.com>
In-Reply-To: <20230622144210.2623299-12-ryan.roberts@arm.com>

On 6/22/23 07:42, Ryan Roberts wrote:
> With the ptep API sufficiently refactored, we can now introduce a new
> "contpte" API layer, which transparently manages the PTE_CONT bit for
> user mappings. Whenever it detects a set of PTEs that meet the
> requirements for a contiguous range, the PTEs are re-painted with the
> PTE_CONT bit.
>
> This initial change provides a baseline that can be optimized in future
> commits. That said, fold/unfold operations (which imply tlb
> invalidation) are avoided where possible with a few tricks for
> access/dirty bit management.
>
> Write-enable and write-protect modifications are likely non-optimal and
> likely incure a regression in fork() performance. This will be addressed
> separately.
>
> Signed-off-by: Ryan Roberts
> ---

Hi Ryan!

While trying out the full series from your gitlab
features/granule_perf/all branch, I found it necessary to EXPORT a symbol
in order to build this. Please see below:

...

> +
> +pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte)
> +{
> +	/*
> +	 * Gather access/dirty bits, which may be populated in any of the ptes
> +	 * of the contig range. We are guarranteed to be holding the PTL, so any
> +	 * contiguous range cannot be unfolded or otherwise modified under our
> +	 * feet.
> +	 */
> +
> +	pte_t pte;
> +	int i;
> +
> +	ptep = contpte_align_down(ptep);
> +
> +	for (i = 0; i < CONT_PTES; i++, ptep++) {
> +		pte = __ptep_get(ptep);
> +
> +		/*
> +		 * Deal with the partial contpte_ptep_get_and_clear_full() case,
> +		 * where some of the ptes in the range may be cleared but others
> +		 * are still to do. See contpte_ptep_get_and_clear_full().
> +		 */
> +		if (pte_val(pte) == 0)
> +			continue;
> +
> +		if (pte_dirty(pte))
> +			orig_pte = pte_mkdirty(orig_pte);
> +
> +		if (pte_young(pte))
> +			orig_pte = pte_mkyoung(orig_pte);
> +	}
> +
> +	return orig_pte;
> +}

Here we need something like this, in order to get it to build in all
possible configurations:

	EXPORT_SYMBOL_GPL(contpte_ptep_get);

(and a corresponding "#include <linux/export.h>" at the top of the file).

This is needed because the static inline functions invoke the routine
above, so any module that uses them must be able to resolve the symbol.

thanks,
-- 
John Hubbard
NVIDIA
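
For anyone who wants to poke at the gather logic outside the kernel, the
quoted loop can be modeled in user space. This is only a rough sketch under
stated assumptions: the model_* names, the bit positions, and the block
size of 16 are all made up for illustration and do not reflect the real
arm64 PTE layout or the kernel's pte_t helpers. It shows the same three
steps as the quoted code: align down to the start of the block, skip
entries that are zero, and OR the young/dirty state into the caller's copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and values below are hypothetical, for illustration only. */
#define MODEL_CONT_PTES 16           /* assumed entries per contiguous block */
#define MODEL_PTE_YOUNG (1ULL << 0)  /* illustrative bit, not the arm64 layout */
#define MODEL_PTE_DIRTY (1ULL << 1)  /* illustrative bit, not the arm64 layout */
#define MODEL_PTE_VALID (1ULL << 2)  /* illustrative bit, not the arm64 layout */

typedef uint64_t model_pte_t;

/* Round a table index down to the first entry of its contiguous block. */
static size_t model_align_down(size_t idx)
{
	return idx - (idx % MODEL_CONT_PTES);
}

/*
 * Return orig with the young/dirty bits of every non-zero entry in the
 * block containing idx merged in. Zero entries are skipped, mirroring
 * the partially cleared range case described in the quoted comment.
 */
static model_pte_t model_contpte_get(const model_pte_t *table, size_t idx,
				     model_pte_t orig)
{
	size_t start = model_align_down(idx);
	size_t i;

	assert(table != NULL);

	for (i = 0; i < MODEL_CONT_PTES; i++) {
		model_pte_t pte = table[start + i];

		if (pte == 0)
			continue;

		if (pte & MODEL_PTE_DIRTY)
			orig |= MODEL_PTE_DIRTY;

		if (pte & MODEL_PTE_YOUNG)
			orig |= MODEL_PTE_YOUNG;
	}

	return orig;
}
```

The model takes no lock; the quoted comment relies on the PTL being held,
which this sketch does not attempt to represent.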