From: Bharata B Rao
Date: Thu, 11 Jul 2024 11:13:18 +0530
Subject: Re: Hard and soft lockups with FIO and LTP runs on a large system
To: Yu Zhao
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, nikunj@amd.com,
 "Upadhyay, Neeraj", Andrew Morton, David Hildenbrand, willy@infradead.org,
 vbabka@suse.cz, kinseyho@google.com, Mel Gorman
References: <1998d479-eb1a-4bc8-a11e-59f8dd71aadb@amd.com>
 <7a06a14e-44d5-450a-bd56-1c348c2951b6@amd.com>

On 09-Jul-24 11:28 AM, Yu Zhao wrote:
> On Mon, Jul 8, 2024 at 10:31 PM Bharata B Rao wrote:
>>
>> On 08-Jul-24 9:47 PM, Yu Zhao wrote:
>>> On Mon, Jul 8, 2024 at 8:34 AM Bharata B Rao wrote:
>>>>
>>>> Hi Yu Zhao,
>>>>
>>>> Thanks for your patches. See below...
>>>>
>>>> On 07-Jul-24 4:12 AM, Yu Zhao wrote:
>>>>> Hi Bharata,
>>>>>
>>>>> On Wed, Jul 3, 2024 at 9:11 AM Bharata B Rao wrote:
>>>>>>
>>>>
>>>>>>
>>>>>> Some experiments tried
>>>>>> ======================
>>>>>> 1) When MGLRU was enabled many soft lockups were observed, no hard
>>>>>> lockups were seen for 48 hours run. Below is once such soft lockup.
>>>>>
>>>>> This is not really an MGLRU issue -- can you please try one of the
>>>>> attached patches? It (truncate.patch) should help with or without
>>>>> MGLRU.
>>>>
>>>> With truncate.patch and default LRU scheme, a few hard lockups are seen.
>>>
>>> Thanks.
>>>
>>> In your original report, you said:
>>>
>>>   Most of the times the two contended locks are lruvec and
>>>   inode->i_lock spinlocks.
>>>   ...
>>>   Often times, the perf output at the time of the problem shows
>>>   heavy contention on lruvec spin lock. Similar contention is
>>>   also observed with inode i_lock (in clear_shadow_entry path)
>>>
>>> Based on this new report, does it mean the i_lock is not as contended,
>>> for the same path (truncation) you tested? If so, I'll post
>>> truncate.patch and add reported-by and tested-by you, unless you have
>>> objections.
>>
>> truncate.patch has been tested on two systems with default LRU scheme
>> and the lockup due to inode->i_lock hasn't been seen yet after 24 hours run.
>
> Thanks.
>
>>>
>>> The two paths below were contended on the LRU lock, but they already
>>> batch their operations. So I don't know what else we can do surgically
>>> to improve them.
>>
>> What has been seen with this workload is that the lruvec spinlock is
>> held for a long time from shrink_[active/inactive]_list path. In this
>> path, there is a case in isolate_lru_folios() where scanning of LRU
>> lists can become unbounded. To isolate a page from ZONE_DMA, sometimes
>> scanning/skipping of more than 150 million folios were seen. There is
>> already a comment in there which explains why nr_skipped shouldn't be
>> counted, but is there any possibility of re-looking at this condition?
>
> For this specific case, probably this can help:
>
> @@ -1659,8 +1659,15 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>                 if (folio_zonenum(folio) > sc->reclaim_idx ||
>                     skip_cma(folio, sc)) {
>                         nr_skipped[folio_zonenum(folio)] += nr_pages;
> -                       move_to = &folios_skipped;
> -                       goto move;
> +                       list_move(&folio->lru, &folios_skipped);
> +                       if (spin_is_contended(&lruvec->lru_lock)) {
> +                               if (!list_empty(dst))
> +                                       break;
> +                               spin_unlock_irq(&lruvec->lru_lock);
> +                               cond_resched();
> +                               spin_lock_irq(&lruvec->lru_lock);
> +                       }
> +                       continue;
>                 }

Thanks, this helped. With this fix, the test ran for 24 hours without any
lockups attributable to the lruvec spinlock. As noted earlier in this
thread, isolate_lru_folios() used to scan millions of folios and spend a
long time with the spinlock held; with this fix, that scenario is no
longer seen.

However, the contention seems to have shifted to other areas. These are
the two MM-related lockups (one soft, one hard) that were observed during
this run:

Soft lockup
===========
watchdog: BUG: soft lockup - CPU#425 stuck for 12s!
CPU: 425 PID: 145707 Comm: fio Kdump: loaded Tainted: G W 6.10.0-rc3-trkwtrs_trnct_nvme_lruvecresched #21
RIP: 0010:handle_softirqs+0x70/0x2f0
__rmqueue_pcplist+0x4ce/0x9a0
get_page_from_freelist+0x2e1/0x1650
__alloc_pages_noprof+0x1b4/0x12c0
alloc_pages_mpol_noprof+0xdd/0x200
folio_alloc_noprof+0x67/0xe0

Hard lockup
===========
watchdog: Watchdog detected hard LOCKUP on cpu 296
CPU: 296 PID: 150155 Comm: fio Kdump: loaded Tainted: G W L 6.10.0-rc3-trkwtrs_trnct_nvme_lruvecresched #21
RIP: 0010:native_queued_spin_lock_slowpath+0x347/0x430
Call Trace:
? watchdog_hardlockup_check+0x1a2/0x370
? watchdog_overflow_callback+0x6d/0x80
native_queued_spin_lock_slowpath+0x347/0x430
_raw_spin_lock_irqsave+0x46/0x60
free_unref_page+0x19f/0x540
? __slab_free+0x2ab/0x2b0
__free_pages+0x9d/0xb0
__free_slab+0xa7/0xf0
free_slab+0x31/0x100
discard_slab+0x32/0x40
__put_partials+0xb8/0xe0
put_cpu_partial+0x5a/0x90
__slab_free+0x1d9/0x2b0
kfree+0x244/0x280
mempool_kfree+0x12/0x20
mempool_free+0x30/0x90
nvme_unmap_data+0xd0/0x150 [nvme]
nvme_pci_complete_batch+0xaf/0xd0 [nvme]
nvme_irq+0x96/0xe0 [nvme]
__handle_irq_event_percpu+0x50/0x1b0

Regards,
Bharata.
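
For readers following the thread, the effect of the contention check in the
quoted isolate_lru_folios() diff can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not kernel code:
struct fake_folio, struct scan_stats, simulate_isolate() and the yield_every
parameter are hypothetical names made up for illustration. Contention is
faked by pretending a waiter shows up after every yield_every visited
folios, whereas the quoted diff asks spin_is_contended() on the real lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: only the zone index matters here. */
struct fake_folio {
	int zone;
};

struct scan_stats {
	size_t isolated;     /* eligible folios taken */
	size_t skipped;      /* ineligible folios passed over */
	size_t lock_drops;   /* times the simulated lock was released */
	size_t longest_hold; /* most folios visited during one lock hold */
};

/*
 * Walk folios[0..n) and isolate up to nr_to_scan folios whose zone is
 * <= reclaim_idx. Skipped folios do not consume the scan budget, which
 * is what lets the walk grow very long. If yield_every is non-zero, a
 * waiter is assumed to appear once yield_every folios have been visited
 * under the current lock hold (an assumption standing in for
 * spin_is_contended()).
 */
struct scan_stats simulate_isolate(const struct fake_folio *folios, size_t n,
				   int reclaim_idx, size_t nr_to_scan,
				   size_t yield_every)
{
	struct scan_stats st = {0};
	size_t held = 0; /* folios visited since the lock was last taken */

	assert(folios != NULL || n == 0);

	for (size_t i = 0; i < n && st.isolated < nr_to_scan; i++) {
		held++;
		if (held > st.longest_hold)
			st.longest_hold = held;

		if (folios[i].zone > reclaim_idx) {
			bool contended = yield_every && held >= yield_every;

			st.skipped++;
			if (contended) {
				/* mirrors: if (!list_empty(dst)) break; */
				if (st.isolated > 0)
					break;
				/* mirrors: unlock, cond_resched(), lock */
				st.lock_drops++;
				held = 0;
			}
			continue;
		}
		st.isolated++;
	}
	return st;
}
```

With 999 ineligible folios ahead of a single eligible one, calling
simulate_isolate() with yield_every set to 0 visits all 1000 folios under
one lock hold, while yield_every set to 100 caps each hold at 100 folios at
the cost of 9 lock drops. The model ignores IRQ state, skip_cma() and the
folios_skipped list handling that the real function deals with.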