From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20B19C4345F for ; Fri, 12 Apr 2024 22:59:22 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A40356B0089; Fri, 12 Apr 2024 18:59:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9EF976B008A; Fri, 12 Apr 2024 18:59:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8692B6B008C; Fri, 12 Apr 2024 18:59:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 5EE4C6B0089 for ; Fri, 12 Apr 2024 18:59:21 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id EF3A116102D for ; Fri, 12 Apr 2024 22:59:20 +0000 (UTC) X-FDA: 82002397680.25.7724217 Received: from NAM10-MW2-obe.outbound.protection.outlook.com (mail-mw2nam10on2046.outbound.protection.outlook.com [40.107.94.46]) by imf12.hostedemail.com (Postfix) with ESMTP id 09FE440002 for ; Fri, 12 Apr 2024 22:59:17 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=meX24XYz; dmarc=pass (policy=reject) header.from=nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf12.hostedemail.com: domain of ziy@nvidia.com designates 40.107.94.46 as permitted sender) smtp.mailfrom=ziy@nvidia.com ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1712962758; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=cypcMMc7DT7ielj1Ua/mQbk7/usgY1aALztnXnFDX8k=; b=x289cuiElvUGa9F9i8TsoQuuJwrES4RlfCYjTARzCwy3QBS3fdliANZws2a/eyy+TN1MTG gBhL78tIH6K8IvwT1jkBC9bRMbjl97v6/m9dj59bx4O9IJuToaDZu16uQwVHU2SIvuNfpu rPBR7rFWVInOGaKCmBJDmPTlf2jNYwI= ARC-Authentication-Results: i=2; imf12.hostedemail.com; dkim=pass header.d=Nvidia.com header.s=selector2 header.b=meX24XYz; dmarc=pass (policy=reject) header.from=nvidia.com; arc=pass ("microsoft.com:s=arcselector9901:i=1"); spf=pass (imf12.hostedemail.com: domain of ziy@nvidia.com designates 40.107.94.46 as permitted sender) smtp.mailfrom=ziy@nvidia.com ARC-Seal: i=2; s=arc-20220608; d=hostedemail.com; t=1712962758; a=rsa-sha256; cv=pass; b=z45C2xQ9q2rDCXWJMYRYFAgtP1kvzI1ddxI7mkiAzVJtAf/NswOvpvbHV+WStAxqtayudo F/wd0q+I40imZPKNHabb+KL4hhpSg4jpGJ1oJxZA1EB7weEvygkZxDV2GYz5QCZoe4NBEz WE29aM+PURF1fS/1+MT+nboAnur0qLc= ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=mopmNbWa5P7NQ97doHovjTfzzhqAP2jpV1cbdm8Tf+/eUWJgtErYkMTrfPYCm4ZsG2rKYgfv8HrGJCKCeBdnx0yvn7VvbQG6taBc38977skrkYZOI7jBS8HMhOf3qWVlxLuPqmnb+A6/3wOTB/mSv6sGRHuenzz+PSpMhgHmYxH1FJQH03BIN61QbICxjS4j2K1FjbCtq33XgZWz/VsnAUaX0mYUt/H9tSeuGtKrZPY5qTCfpWBrUd9H+5Zt/2IxtY5orUX1gAVpyWR0+sUFlBTH0HvcGbFLEsWXk/7I7CALTm2aIq0dzU2FldUiSPPSc6Y5yapghGIKOEiLJkOwfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cypcMMc7DT7ielj1Ua/mQbk7/usgY1aALztnXnFDX8k=; b=ESfI18SzYx8J3ZW0HTnLisgOj/dtvGTruQHh23CryIGozh87qLX4qA6gKrcVtSCZBL2oTZz7LWOOKgveOWbJ24vPW5R0N0e9vpadeHItit0fGrY2bnN//T0z2x+FJf6Or+vDCXlokfyiaAV6l2j5O3OkkKbCLS/bHtU4UNagKKVgnUVH5mdV95H8NEgvtfNLmIDAJduAvhdsLlPmOqRx6FKhc8Zt0NJ1/0nDhm++XDIVCqDoJQEYnRWwfDrEi7kjWXPXOUtMBMnx6j7fHsOI7HvLd2mlzOPRSz/YJQaTsvXjD56PRe6ydWbttnuNrwBcygYJNb048n2vjXuVphFMuw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cypcMMc7DT7ielj1Ua/mQbk7/usgY1aALztnXnFDX8k=; b=meX24XYzUhOlhtYBcB552WZMshvZhTgleOWUw0awxI9ZMXeAdkwzeHaXc95ZUbVg5iJGRi7tqdEWPy0cxL8g8IZbR2+VdCQdCYuG77Pmki9goWp7pBFz9nKMl+7vHUVeTLnVP9YMYMvvF739hMHJLFe8h2uq4hfkVMR32bvGEI7+dd+JV3lh1ViPV9HKd/Locc/vRKNpbEv/tTyX6JGRYJX5ow/ivXjfuofGG9NZHFVFRehgO289sfO7BrjqAhYbVFjjzIfVpWl4Z5Xt+vXjigTpjNd32oMQxkuWWErrqJhvvE9KcAvUv2+qpb7354T7Jv2pcSrxaDfmnSCWZgBimw== Received: from DS7PR12MB5744.namprd12.prod.outlook.com (2603:10b6:8:73::18) by SN7PR12MB7300.namprd12.prod.outlook.com (2603:10b6:806:298::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.46; Fri, 12 Apr 2024 22:59:14 +0000 Received: from DS7PR12MB5744.namprd12.prod.outlook.com ([fe80::dc5c:2cf1:d5f5:9753]) by DS7PR12MB5744.namprd12.prod.outlook.com ([fe80::dc5c:2cf1:d5f5:9753%6]) with mapi id 15.20.7409.053; Fri, 12 Apr 2024 22:59:14 +0000 From: Zi Yan To: Yang Shi Cc: David Hildenbrand , linux-mm@kvack.org, Andrew Morton , Matthew Wilcox , Ryan Roberts , Barry Song <21cnbao@gmail.com>, linux-kernel@vger.kernel.org Subject: Re: [PATCH] mm/rmap: do not add fully unmapped large folio to deferred split list Date: Fri, 12 Apr 2024 18:59:12 -0400 X-Mailer: MailMate (1.14r6028) Message-ID: <2BE605BB-F474-48E3-A54F-1E9371BF59E5@nvidia.com> In-Reply-To: References: <20240411153232.169560-1-zi.yan@sent.com> <2C698A64-268C-4E43-9EDE-6238B656A391@nvidia.com> Content-Type: multipart/signed; boundary="=_MailMate_38F5DF45-4531-4E6B-8DEC-9EBA48DAE181_="; micalg=pgp-sha512; protocol="application/pgp-signature" X-ClientProxiedBy: BL1PR13CA0124.namprd13.prod.outlook.com (2603:10b6:208:2bb::9) To DS7PR12MB5744.namprd12.prod.outlook.com (2603:10b6:8:73::18) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS7PR12MB5744:EE_|SN7PR12MB7300:EE_ X-MS-Office365-Filtering-Correlation-Id: 04ea66ad-f10c-45ce-1b46-08dc5b442c78 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: kXcH1hhySTDgYiuCeBUY7rIgEiN7WgnGh2oBOm6Z5w/AKTOmdcHFdmdcYNh7v7CmCUR07PLPZROwSTQCt+2ypXbyH0SjANye6tOT8by1PEPKYYLdlYKQYifN4P28mhh06tkLj4MxNLsKGJ1D4wjJpmD+pt42mjMakMxd0Er/CT12ooOCIE4J1h2Eb877Ktk7t6sRJpLYfh2QCVjDBTvkQ5V0XzwlA0Q2Cq+IFC2pawKrVfwsbBX1rlCq1v0jOwy9EAQkSb6quFnVqO/B3ZEnDQLBHAmC2eYhwAstJqjQ9iP0hMt+9oYKhnnq2TXVW7b+f7FAmZmiBYgGGA7Zf9Xahb38Dj3UUrm28TWuRNw1DVWDVxZClnHH/4MLFASq5pCqgnV72U32O7LXS/BvNUC6fWn9pKpGp+3odRJB2o7uK/LWJrBXIZK5iLaJiHRj3qeNi2ysmoDaBS1kVsBhJbjgE/WMS+EIAwvVpT/BHyH+wg6aD6trzvqYIrOqVwMwaL4kmQCEiwuagx1iuJsmboxYU2y9cdv3+fnoEZoyRdpYI0ajzuXoC/eYiB+ETZ/ICisyqnbqAogY1LJAp02pgtbIzx6vwYD6ggEZfsnJjNFh1XDzq9DfRBNTg22e6QmW760vzCL+lifjoG8S1djU/5R2I8MdcHABTfV+XOPR6DF98tU= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DS7PR12MB5744.namprd12.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(376005)(1800799015)(366007);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?L0VXUDZmUExXSDJIMk5xSE9JMmVLamhra1paS01ZUGJPT1hjMkY3bTNvRDAy?= =?utf-8?B?S0pla1o0N2NOUFJ6UnlOSHNuYmUwNVZkN1JSRkNwemdob1BhU01NNzI4enJr?= =?utf-8?B?cXo5OXBwKzJhS0MwT3JXRDZwNDJTMlM5RVRYWGdYNDQ2VW9FWkNieEprbWQz?= =?utf-8?B?b2VnM0d5cnhZRWdwNG83Mk41RUdLbFdHS0l3THJsTGF3ZWMvTzZnekdLTVc3?= =?utf-8?B?aTVVUzFlNjUzVVZnZnBmN2xxNzlINVBvRm40MU8rNGhLOVlTT1NET3lFZC9o?= =?utf-8?B?RXhtVFNrVDRlQURVQ0N2OTMrRmZXMHAyd3pHaFR3YXJSVVZyL05rN1lycGhW?= =?utf-8?B?Y3Q4THpLb0lrait0WWY5eWo1NUpJU2xHNno0VnJQTXVTcGx6NlhES0RGVTBr?= =?utf-8?B?M21ndXhrdjMra1AwUWZDZ2NQa3F0THhlLzdISE5VT0VWN0o2dDNmaGRpSEx6?= =?utf-8?B?akFLWmhraXgwUnMzMk9MdUgvNlRjYXoyVlowbjBMK0FzVUVEcFNNV0lSbnM4?= =?utf-8?B?VkQwQXgzb204SWoyR3daVW5xcXY2MFdCVnl3Tmxvc2dhU2dNZ2pGQTFFcUJh?= =?utf-8?B?aWRERUpUMFloWmVHdlpXcHlnMndXMitJUGwvRks4MU8zZGFFSVlESHQzTUI0?= =?utf-8?B?S3hSamYrcEJad3hqcnVuM1U2USt6cUFDSUFDcWJxSmVLcmUxbDlKV251YlJu?= =?utf-8?B?c2MxMnZKVGpXOXpiQU1ZZnhzV1FGU095TTBQSllCd2xCbnVWWURKK29PNlE1?= =?utf-8?B?dWh6RUdaaGlVd09DbFFpZm9WYXJpSFBrNG5INkVCbFdtT1JDZFNuV2NyYktS?= =?utf-8?B?MHREQ0U3RG8xc3hFM1p1dnNwSEp3NFR0ZHN5UUdZcGZ6YlRjWmV2TWRkNkht?= =?utf-8?B?a1RCVEZtQS82SHQ5ZjJKRmNPSU9lcFh1a1Fob2FyaUMrTExWeU1CQnRSS1dL?= =?utf-8?B?MlVxLzRRUUljMmg5SnV0TG93Um1nQmVSYWlpNzRRZHRQQ3BXcDQ5UWU5L05l?= =?utf-8?B?Ky96dGpYMGZ2d2pHLzE3U3Y3T0Z5K3dzTDlKd0ZuVVI2VURiQVo0UWZOQjNz?= =?utf-8?B?UmxoY0puR0RUeGdUSDBtNGhFMmtwYTZCekFGNGMzZVQwNXBONzMzdGNEWm95?= =?utf-8?B?d0gxakFWVnBKQ0VkY2tUekRnUmZEU1JwcDYrUmlSVm5PNS9LaUNTdDJlQ05G?= =?utf-8?B?OVRnQkdYMmZOZHErckxvWDA0NjFzSE43Y0Fack9UTU1nZU5tK2l3amJLWDJG?= =?utf-8?B?U1ordy9BcytrOVQvaGY3Vjg1Z042N2czM0JXMTRhVHVUTGhTSndrSXRpNDI5?= =?utf-8?B?TGhjNU1OeWZSMmZDdHkvQTk4SjZFNzB5TlpUbC9JWW80alI5V3BIMXM5cG16?= =?utf-8?B?azEydVBwb29mYTNOK0IvTXVjb1RUMmVzMG9GRG5QYTFKKzF3REdQOFZocW1p?= =?utf-8?B?TkdaNHJ4ekFqMVg2T0RUNG9jVEtNbDFoKzdEWUtUTDl4Yy9UWU0rM21YNXB6?= =?utf-8?B?OTJsa3VvekpnT2pxbGdzTGRtZFhCdkRpbjBqdUFFeXhjREVudm0wUkVPcmhy?= =?utf-8?B?NlhZdHBieFE1N3JnczYrR2dhM0NnYll0YUZ6aEkycmZFQnUwajQ0M3d6NjVD?= =?utf-8?B?UlNQZlQxRUtiWklUZFhIOFNzYWM3b2Fnd05xNTJZZjlIcmJ4cW9IdnZESnl3?= =?utf-8?B?d3JaWjJhMXJ6S1k0dlY0WldxTFFMZTRpUGJyVTVmeXBHTWtOeXhKd3lSb3JJ?= =?utf-8?B?MGJDL2xYZFlxTHY4N0FiNXVCeUJ2eVBZYlg4Zkpvd0xtTVBpU2xxcEhpai8z?= =?utf-8?B?S3Fwdm1NRWlRTFFNMmNvRldma0s0d01nZlRVZWpIN0VCV0o3NGd4Y1ZaVWt6?= =?utf-8?B?Zk8yOGJPTHBQRUtsYTZUeUExWWF3Q2Jub254VWdNSW1BUDJoT3pFczNNRVFs?= =?utf-8?B?UE1GaXpPK0lLaWpDNWxmcmNmRVdPQ3dWOGEvMFVZdThWaVZFTmhkZmN6NXE1?= =?utf-8?B?azc2VFYvb1V3WXNGaTBrMXJiNGlnY0crNHk3NHBCQkJWeVZxV093UEpVZ3dx?= =?utf-8?B?em5VR05GZTBhTHdqQmd3MmxoQlhhelNjSy9VZlBEbG5CUG1USjRyTE8xV0E3?= =?utf-8?Q?lCbY=3D?= X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 04ea66ad-f10c-45ce-1b46-08dc5b442c78 X-MS-Exchange-CrossTenant-AuthSource: DS7PR12MB5744.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 12 Apr 2024 22:59:14.2731 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: +AjZB0v3FKbC1G9vgFGNyykryZh47cUDhXnfPr34oVvY7RKWfD1aaFWrtgZ0BC9O X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN7PR12MB7300 X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 09FE440002 X-Stat-Signature: 51m5eg16ujh84sgty7rrkrkoichfexb8 X-HE-Tag: 1712962757-17311 X-HE-Meta: U2FsdGVkX1+GmAG0U5ODZQjUwsTHcEKpIz+HHZJhzTjrUmHGFWCjzdHZJsDtw5/uc0Yb+wo+hyTQZhxZMgz4yVwwCxs9PBeI9GjjtuEjR/o8iKWpHUYMcCoSgMpxsQ66FCAAQAqeIrvwSrny5dH8OpC6p0jAUmB++3Hpeut8C1O6QsXsYLSeSU6NVXdj8vb+NxxmDs5IRzzKMFZ82F+sOhWEIE7SQpyCWRjfBSmJRxkymlPS6LdxzymJS8nRzjq1ZcmG1CJN9CnrpFHCffniPCBemLq9pyyy+8xd5zTTh6isccoNB/aWRbcj66Gws+N5COv+K/BQREd3jCzIj4Ip6II6T1W4BvvMqfuaQ4lI4uL1OXcQkAfewTUtjMFCNODxS6RVfIO1hTjPO/ROFbR4pxjt0Pr2FQt0B5sssZyVeHidwD6TNaSj7iDGR+Ju7sKZpS4lyNvFSrinKpEeJ66ON3UP07mxC3LH+JHslRJAsYL2f7SDWe6ous5FHG8cG5361RN3LhCHP34fKmNZID5oY3uW05XDVhIofcilNbUtz5VM1zxkovBgcDEftWHPLJC68E2T/gS0M3GWRB9ZOGJodHbRw2K043aGkMSjKSpmINqm+4/aUHiil76/PCVnEcUM8lm86+wztqapbktEg/typR869ZZ0ZNNm3tNNEaiaEJSUHgxnvxQNgZ/ee5U99/V2fRIj9Kx9s5RZaiske7HMe8lmHmwadlgh5MpWJCBe4UAPQDQMTdQMobh/UxEkQ3C8+r3impXHUq5jMh3PrKFjuEYAmJO+696AmMN/6Rx2tC+8oPhBEt0uTP8/8WUdJ7Yo04rGeJk1nUT6YxItuuAUoPZnXPRkdYqf0S+/aIVrC09YXbwmh6bsZ/zP5T2Z+fvsnTgMZ5XFtUZGyrowx0b45PjvEEgEjBhZkzlFVFwH2IxTIDKcd9GP0VB2gpevZRh1BphT4ZrUNEHD1H1RLEy 5i1pukQ8 DlRJHCy+XV84/ekx8eAbfCRzgQQZ8DUPodeU2tKSLLuCk3fdMjMHMINGvBYfjiBGF3T+fitsSlCpPzusYySHZ1A5WhQtDk0jMy/4Qm9c/W7hJXVbDY1Q2AUnb5Eh7OkxWX0F3E+cB+nHAdX6NxinMe1yCwYIF62KAMLMxLODT0zFf7cd8BMfOiK4R97VKbisL1KxT4ylxsi1z6IukihGXXcWr1ZEO44AI1UWExviXuHm4L89airERowD5pCT9B9UK5BK8gJCcHbthba0QmCufSXBeLVGqsDhYeWIXvVPZttdC0bOJY6pQDgrPuCACe+DNJaMVYzMbONdckkomYqDxECgDtQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: --=_MailMate_38F5DF45-4531-4E6B-8DEC-9EBA48DAE181_= Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 12 Apr 2024, at 18:29, Yang Shi wrote: > On Fri, Apr 12, 2024 at 2:06=E2=80=AFPM Zi Yan wrote: >> >> On 12 Apr 2024, at 15:32, David Hildenbrand wrote: >> >>> On 12.04.24 16:35, Zi Yan wrote: >>>> On 11 Apr 2024, at 11:46, David Hildenbrand wrote: >>>> >>>>> On 11.04.24 17:32, Zi Yan wrote: >>>>>> From: Zi Yan >>>>>> >>>>>> In __folio_remove_rmap(), a large folio is added to deferred split= list >>>>>> if any page in a folio loses its final mapping. It is possible tha= t >>>>>> the folio is unmapped fully, but it is unnecessary to add the foli= o >>>>>> to deferred split list at all. Fix it by checking folio mapcount b= efore >>>>>> adding a folio to deferred split list. >>>>>> >>>>>> Signed-off-by: Zi Yan >>>>>> --- >>>>>> mm/rmap.c | 9 ++++++--- >>>>>> 1 file changed, 6 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/mm/rmap.c b/mm/rmap.c >>>>>> index 2608c40dffad..d599a772e282 100644 >>>>>> --- a/mm/rmap.c >>>>>> +++ b/mm/rmap.c >>>>>> @@ -1494,7 +1494,7 @@ static __always_inline void __folio_remove_r= map(struct folio *folio, >>>>>> enum rmap_level level) >>>>>> { >>>>>> atomic_t *mapped =3D &folio->_nr_pages_mapped; >>>>>> - int last, nr =3D 0, nr_pmdmapped =3D 0; >>>>>> + int last, nr =3D 0, nr_pmdmapped =3D 0, mapcount =3D 0; >>>>>> enum node_stat_item idx; >>>>>> __folio_rmap_sanity_checks(folio, page, nr_pages, level= ); >>>>>> @@ -1506,7 +1506,8 @@ static __always_inline void __folio_remove_r= map(struct folio *folio, >>>>>> break; >>>>>> } >>>>>> - atomic_sub(nr_pages, &folio->_large_mapcount); >>>>>> + mapcount =3D atomic_sub_return(nr_pages, >>>>>> + &folio->_large_mapcount) += 1; >>>>> >>>>> That becomes a new memory barrier on some archs. Rather just re-rea= d it below. Re-reading should be fine here. >>>> >>>> Would atomic_sub_return_relaxed() work? Originally I was using atomi= c_read(mapped) >>>> below, but to save an atomic op, I chose to read mapcount here. >>> >>> Some points: >>> >>> (1) I suggest reading about atomic get/set vs. atomic RMW vs. atomic >>> RMW that return a value -- and how they interact with memory barriers= =2E >>> Further, how relaxed variants are only optimized on some architecture= s. >>> >>> atomic_read() is usually READ_ONCE(), which is just an "ordinary" mem= ory >>> access that should not be refetched. Usually cheaper than most other = stuff >>> that involves atomics. >> >> I should have checked the actual implementation instead of being foole= d >> by the name. Will read about it. Thanks. >> >>> >>> (2) We can either use folio_large_mapcount() =3D=3D 0 or !atomic_read= (mapped) >>> to figure out if the folio is now completely unmapped. >>> >>> (3) There is one fundamental issue: if we are not batch-unmapping the= whole >>> thing, we will still add the folios to the deferred split queue. Migr= ation >>> would still do that, or if there are multiple VMAs covering a folio. >>> >>> (4) We should really avoid making common operations slower only to ma= ke >>> some unreliable stats less unreliable. >>> >>> >>> We should likely do something like the following, which might even be= a bit >>> faster in some cases because we avoid a function call in case we unma= p >>> individual PTEs by checking _deferred_list ahead of time >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index 2608c40dffad..356598b3dc3c 100644 >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -1553,9 +1553,11 @@ static __always_inline void __folio_remove_rma= p(struct folio *folio, >>> * page of the folio is unmapped and at least one pag= e >>> * is still mapped. >>> */ >>> - if (folio_test_large(folio) && folio_test_anon(folio)= ) >>> - if (level =3D=3D RMAP_LEVEL_PTE || nr < nr_pm= dmapped) >>> - deferred_split_folio(folio); >>> + if (folio_test_large(folio) && folio_test_anon(folio)= && >>> + (level =3D=3D RMAP_LEVEL_PTE || nr < nr_pmdmapped= ) && >>> + atomic_read(mapped) && >>> + data_race(list_empty(&folio->_deferred_list))) >> >> data_race() might not be needed, as Ryan pointed out[1] >> >>> + deferred_split_folio(folio); >>> } >>> >>> I also thought about handling the scenario where we unmap the whole >>> think in smaller chunks. We could detect "!atomic_read(mapped)" and >>> detect that it is on the deferred split list, and simply remove it >>> from that list incrementing an THP_UNDO_DEFERRED_SPLIT_PAGE event. >>> >>> But it would be racy with concurrent remapping of the folio (might ha= ppen with >>> anon folios in corner cases I guess). >>> >>> What we can do is the following, though: >>> >>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >>> index dc30139590e6..f05cba1807f2 100644 >>> --- a/mm/huge_memory.c >>> +++ b/mm/huge_memory.c >>> @@ -3133,6 +3133,8 @@ void folio_undo_large_rmappable(struct folio *f= olio) >>> ds_queue =3D get_deferred_split_queue(folio); >>> spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >>> if (!list_empty(&folio->_deferred_list)) { >>> + if (folio_test_pmd_mappable(folio)) >>> + count_vm_event(THP_UNDO_DEFERRED_SPLIT_PAGE);= >>> ds_queue->split_queue_len--; >>> list_del_init(&folio->_deferred_list); >>> } >>> >>> Adding the right event of course. >>> >>> >>> Then it's easy to filter out these "temporarily added to the list, bu= t never split >>> before the folio was freed" cases. >> >> So instead of making THP_DEFERRED_SPLIT_PAGE precise, use >> THP_DEFERRED_SPLIT_PAGE - THP_UNDO_DEFERRED_SPLIT_PAGE instead? That s= hould work. > > It is definitely possible that the THP on the deferred split queue are > freed instead of split. For example, 1M is unmapped for a 2M THP, then > later the remaining 1M is unmapped, or the process exits before memory > pressure happens. So how come we can tell it is "temporarily added to > list"? Then THP_DEFERRED_SPLIT_PAGE - THP_UNDO_DEFERRED_SPLIT_PAGE > actually just counts how many pages are still on deferred split queue. > It may be useful. However the counter is typically used to estimate > how many THP are partially unmapped during a period of time. So we > just need to know the initial value and the value when we read it > again. > >> >> I wonder what THP_DEFERRED_SPLIT_PAGE counts. If it counts THP deferre= d >> splits, why not just move the counter to deferred_split_scan(), where = the actual >> split happens. Or the counter has a different meaning? > > The deferred_split_scan() / deferred_split_count() just can return the > number of pages on a specific queue (a specific node with a specific > memcg). But THP_DEFERRED_SPLIT_PAGE is a global counter. Did I miss > something? Or you mean traverse all memcgs and all nodes? It sounds > too overkilling. I mean instead of increasing THP_DEFERRED_SPLIT_PAGE when a folio is adde= d to the split list, increase it when a folio is split in deferred_split_sc= an(), regardless which list the folio is on. -- Best Regards, Yan, Zi --=_MailMate_38F5DF45-4531-4E6B-8DEC-9EBA48DAE181_= Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQJDBAEBCgAtFiEE6rR4j8RuQ2XmaZol4n+egRQHKFQFAmYZvMAPHHppeUBudmlk aWEuY29tAAoJEOJ/noEUByhUALIP/2AJjtFPn2bhUalps2I3KXaT2e3UlfzBeCfq y8AT2o8oOzI4YNOoJsGdIihkJbruz84EL6feN1uJsypAqThpKGAKrC+M+/VZZa6P EPFdzkXnLG+sZLuWf8a+t6WLTx6/sSJGsOgyIcCJc42WnqCc/r9LqImAUuCcY4cn 1H2W8ddV/6hMJw/lpXhAUZP2tTI2mQd1SYAjCBGCkHe+gXeI9ybnkb0iO3TCtj7t apgGC9pOa3qD9Ub46SMLv0pwLoX/KRtvto9BqnVBXksyB8yo4IAw9QVNkIFWxlgF hn3oTUz/Ve/bhrlfMqX69pa5ZWz0YL8nKTRsSLYnZTzNPXjGdc4uyfIe05x7lZm+ J8WqotyY4YzI1KXgk6RLzhwgkx5Mx/dy/HmD3Qzat7zE2dinBRUUqhS70L3b31Bq lINMV+Q5B3sf0eS77RR7gvbLj/grUFtkbSO3WI/QwGTh6IsDywL/43qBmcSLfqQ0 IN6rSuIfZ0hvqgNc9VYlxW44gj9lS+Tu3d7sXoF0XzR/zjq9bcIJFM5MUrt+p2ML OLzsPUQ3atyO/gey7lFXkbnXzI9t0SF9gc3ekpnvHc49NTQQCNe7TCTtjKBUA3mU Dk0ZLFlDXK5KoYfOHyfDSbTA6FTWc0GVcUCVSEI7kIW3akqOnMAAqNuTC1NIkZVP 77CUFxX3 =GY8I -----END PGP SIGNATURE----- --=_MailMate_38F5DF45-4531-4E6B-8DEC-9EBA48DAE181_=--