Date: Wed, 1 May 2024 09:10:32 -0300
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: John Hubbard, Andrew Morton, LKML, linux-rdma@vger.kernel.org,
    linux-mm@kvack.org, Mike Marciniszyn, Leon Romanovsky,
    Artemy Kovalyov, Michael Guralnik, Alistair Popple, Pak Markthub
Subject: Re: [RFC] RDMA/umem: pin_user_pages*() can temporarily fail due to migration glitches
Message-ID: <20240501121032.GA941030@nvidia.com>
References: <20240501003117.257735-1-jhubbard@nvidia.com>

On Tue, Apr 30, 2024 at 10:10:43PM -0700, Christoph Hellwig wrote:
> > +		pinned = -ENOMEM;
> > +		int attempts = 0;
> > +		/*
> > +		 * pin_user_pages_fast() can return -EAGAIN, due to falling back
> > +		 * to gup-slow and then failing to migrate pages out of
> > +		 * ZONE_MOVABLE due to a transient elevated page refcount.
> > +		 *
> > +		 * One retry is enough to avoid this problem, so far, but let's
> > +		 * use a slightly higher retry count just in case even larger
> > +		 * systems have a longer-lasting transient refcount problem.
> > +		 *
> > +		 */
> > +		static const int MAX_ATTEMPTS = 3;
> > +
> > +		while (pinned == -EAGAIN && attempts < MAX_ATTEMPTS) {
> > +			pinned = pin_user_pages_fast(cur_base,
> > +						     min_t(unsigned long,
> > +							npages, PAGE_SIZE /
> > +							sizeof(struct page *)),
> > +						     gup_flags, page_list);
> >  			ret = pinned;
> > -			goto umem_release;
> > +			attempts++;
> > +
> > +			if (pinned == -EAGAIN)
> > +				continue;
> >  		}
> > +		if (pinned < 0)
> > +			goto umem_release;
>
> This doesn't make sense.  IFF a blind retry is all that is needed it
> should be done in the core functionality.  I fear it's not that easy,
> though.

+1

This migration retry weirdness is a GUP issue. It needs to be solved
in the mm, not exposed to every pin_user_pages caller.

If it turns out ZONE_MOVABLE pages can't actually be reliably moved
then it is pretty broken.

Jason
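
To make the "do it in the core, not in every caller" point concrete,
here is a minimal user-space sketch. It is not kernel code and not a
proposed fix: fake_pin_pages(), pin_pages_retrying(), caller_pin() and
transient_failures_left are hypothetical names invented for
illustration, and the retry bound of 3 is only borrowed from the quoted
patch. It shows where a bounded -EAGAIN retry would live if a blind
retry really were sufficient, which the thread doubts.

```c
/*
 * User-space illustration only. None of these names exist in the
 * kernel; they are assumptions made for this sketch.
 */
#include <assert.h>
#include <errno.h>

/* Simulated transient condition: calls that will still see -EAGAIN. */
static int transient_failures_left;

/* Hypothetical stand-in for a low-level pin call that can fail transiently. */
static long fake_pin_pages(long npages)
{
	if (transient_failures_left > 0) {
		transient_failures_left--;
		return -EAGAIN;
	}
	return npages;	/* pretend every requested page was pinned */
}

/*
 * Hypothetical core-side helper: retry a bounded number of times on
 * -EAGAIN so that callers do not each open-code the loop.
 */
static long pin_pages_retrying(long npages, int max_attempts)
{
	long pinned = -EAGAIN;
	int attempts;

	assert(max_attempts > 0);

	for (attempts = 0; attempts < max_attempts; attempts++) {
		pinned = fake_pin_pages(npages);
		if (pinned != -EAGAIN)
			break;	/* success or a non-transient error */
	}
	return pinned;
}

/* A caller (think: a umem-style user) only sees the final result. */
static long caller_pin(long npages)
{
	return pin_pages_retrying(npages, 3);
}
```

With one simulated transient failure, caller_pin(8) returns 8. With
more failures than the retry bound, the caller still gets -EAGAIN,
which is why a blind retry alone may not be enough.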