Date: Fri, 22 Mar 2024 13:08:47 -0300
From: Jason Gunthorpe
To: Peter Xu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Michael Ellerman, Christophe Leroy,
	Matthew Wilcox, Rik van Riel, Lorenzo Stoakes, Axel Rasmussen,
	Yang Shi, John Hubbard, linux-arm-kernel@lists.infradead.org,
	"Kirill A . Shutemov", Andrew Jones, Vlastimil Babka, Mike Rapoport,
	Andrew Morton, Muchun Song, Christoph Hellwig,
	linux-riscv@lists.infradead.org, James Houghton, David Hildenbrand,
	Andrea Arcangeli, "Aneesh Kumar K . V", Mike Kravetz
Subject: Re: [PATCH v3 12/12] mm/gup: Handle hugetlb in the generic follow_page_mask code
Message-ID: <20240322160847.GA2924038@nvidia.com>
References: <20240321220802.679544-1-peterx@redhat.com>
	<20240321220802.679544-13-peterx@redhat.com>
	<20240322133012.GI159172@nvidia.com>

On Fri, Mar 22, 2024 at 11:55:11AM -0400, Peter Xu wrote:
> Jason,
>
> On Fri, Mar 22, 2024 at 10:30:12AM -0300, Jason Gunthorpe wrote:
> > On Thu, Mar 21, 2024 at 06:08:02PM -0400, peterx@redhat.com wrote:
> >
> > > A quick performance test on an aarch64 VM on M1 chip shows 15% degrade over
> > > a tight loop of slow gup after the path switched.  That shouldn't be a
> > > problem because slow-gup should not be a hot path for GUP in general: when
> > > page is commonly present, fast-gup will already succeed, while when the
> > > page is indeed missing and require a follow up page fault, the slow gup
> > > degrade will probably buried in the fault paths anyway.  It also explains
> > > why slow gup for THP used to be very slow before 57edfcfd3419 ("mm/gup:
> > > accelerate thp gup even for "pages != NULL"") lands, the latter not part of
> > > a performance analysis but a side benefit.  If the performance will be a
> > > concern, we can consider handle CONT_PTE in follow_page().
> >
> > I think this is probably fine for the moment, at least for this
> > series, as CONT_PTE is still very new.
> >
> > But it will need to be optimized. "slow" GUP is the only GUP that is
> > used by FOLL_LONGTERM and it still needs to be optimized because you
> > can't assume a FOLL_LONGTERM user will be hitting the really slow
> > fault path. There are enough important cases where it is just reading
> > already populated page tables, and these days, often with large folios.
>
> Ah, I thought FOLL_LONGTERM should work in most cases for fast-gup,
> especially for hugetlb, but maybe I missed something?

Ah, no, this is my bad memory. There was a time when that was true,
but it is not the case now. Oh, it is a really bad memory because it
seems I removed parts of it :)

> I do see that devmap skips fast-gup for LONGTERM, we also have that
> writeback issue but none of those that I can find applies to
> hugetlb.  This might be a problem indeed if we have hugetlb cont_pte
> pages that will constantly fallback to slow gup.

Right, DAX would be the main use case I can think of. Today the
intersection of DAX and contig PTE is nonexistent, so let's not worry.

> OTOH, I also agree with you that such batching would be nice to have for
> slow-gup, likely devmap or many fs (exclude shmem/hugetlb) file mappings
> can at least benefit from it due to above.  But then that'll be a more
> generic issue to solve, IOW, we still don't do that for !hugetlb cont_pte
> large folios, before or after this series.

Right, improving contig PTE is going to be a process, and eventually it
will make sense to optimize this regardless of hugetlbfs.

Jason