Date: Wed, 22 May 2024 12:13:12 +0200
Subject: Re: [PATCH 1/1] mm: protect xa split stuff under lruvec->lru_lock during migration
To: Zhaoyang Huang
Cc: Dave Chinner, Andrew Morton, "zhaoyang.huang", Alex Shi,
 "Kirill A . Shutemov", Hugh Dickins, Baolin Wang, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, steve.kang@unisoc.com
References: <20240412064353.133497-1-zhaoyang.huang@unisoc.com>
 <20240412143457.5c6c0ae8f6df0f647d7cf0be@linux-foundation.org>
 <2652f0c1-acc9-4288-8bca-c95ee49aa562@marcinwanat.pl>
From: Marcin Wanat

On 22.05.2024 07:37, Zhaoyang Huang wrote:
> On Tue, May 21, 2024 at 11:47 PM Marcin Wanat wrote:
>>
>> On 21.05.2024 03:00, Zhaoyang Huang wrote:
>>> On Tue, May 21, 2024 at 8:58 AM Zhaoyang Huang wrote:
>>>>
>>>> On Tue, May 21, 2024 at 3:42 AM Marcin Wanat wrote:
>>>>>
>>>>> On 15.04.2024 03:50, Zhaoyang Huang wrote:
>>>>> I have around 50 hosts handling high I/O (each with 20Gbps+ uplinks
>>>>> and multiple NVMe drives), running RockyLinux 8/9. The stock RHEL
>>>>> kernel 8/9 is NOT affected, and the long-term kernel 5.15.X is NOT affected.
>>>>> However, with long-term kernels 6.1.XX and 6.6.XX,
>>>>> (tested at least 10 different versions), this lockup always appears
>>>>> after 2-30 days, similar to the report in the original thread.
>>>>> The more load (for example, copying a lot of local files while
>>>>> serving 20Gbps traffic), the higher the chance that the bug will appear.
>>>>>
>>>>> I haven't been able to reproduce this during synthetic tests,
>>>>> but it always occurs in production on 6.1.X and 6.6.X within 2-30 days.
>>>>> If anyone can provide a patch, I can test it on multiple machines
>>>>> over the next few days.
>>>> Could you please try this one which could be applied on 6.6 directly. Thank you!
>>> URL: https://lore.kernel.org/linux-mm/20240412064353.133497-1-zhaoyang.huang@unisoc.com/
>>>
>>
>> Unfortunately, I am unable to cleanly apply this patch against the
>> latest 6.6.31
> Please try below one which works on my v6.6 based android. Thank you
> for your test in advance :D
>
> mm/huge_memory.c | 22 ++++++++++++++--------
> 1 file changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c

I have compiled 6.6.31 with this patch and will test it on multiple
machines over the next 30 days. I will provide an update after 30 days
if everything is fine, or sooner if any of the hosts experience the same
soft lockup again.