From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhaoyang Huang <huangzhaoyang@gmail.com>
Date: Wed, 22 May 2024 13:37:46 +0800
Subject: Re: [PATCH 1/1] mm: protect xa split stuff under lruvec->lru_lock during migration
To: Marcin Wanat
Cc: Dave Chinner, Andrew Morton, "zhaoyang.huang", Alex Shi, "Kirill A. Shutemov", Hugh Dickins, Baolin Wang, linux-mm@kvack.org, linux-kernel@vger.kernel.org, steve.kang@unisoc.com
References: <20240412064353.133497-1-zhaoyang.huang@unisoc.com> <20240412143457.5c6c0ae8f6df0f647d7cf0be@linux-foundation.org> <2652f0c1-acc9-4288-8bca-c95ee49aa562@marcinwanat.pl>
In-Reply-To: <2652f0c1-acc9-4288-8bca-c95ee49aa562@marcinwanat.pl>

On Tue, May 21, 2024 at 11:47 PM Marcin Wanat wrote:
>
> On 21.05.2024 03:00, Zhaoyang Huang wrote:
> > On Tue, May 21, 2024 at 8:58 AM Zhaoyang Huang wrote:
> >>
> >> On Tue, May 21, 2024 at 3:42 AM Marcin Wanat wrote:
> >>>
> >>> On 15.04.2024 03:50, Zhaoyang Huang wrote:
> >>> I have around 50 hosts handling high I/O (each with 20 Gbps+ uplinks
> >>> and multiple NVMe drives), running RockyLinux 8/9. The stock RHEL
> >>> 8/9 kernel is NOT affected, and the long-term 5.15.x kernel is NOT affected.
> >>> However, with the long-term 6.1.xx and 6.6.xx kernels
> >>> (tested on at least 10 different versions), this lockup always appears
> >>> after 2-30 days, similar to the report in the original thread.
> >>> The more load (for example, copying a lot of local files while
> >>> serving 20 Gbps of traffic), the higher the chance that the bug will appear.
> >>>
> >>> I haven't been able to reproduce this during synthetic tests,
> >>> but it always occurs in production on 6.1.x and 6.6.x within 2-30 days.
> >>> If anyone can provide a patch, I can test it on multiple machines
> >>> over the next few days.
> >> Could you please try this one, which can be applied to 6.6 directly? Thank you!
> > URL: https://lore.kernel.org/linux-mm/20240412064353.133497-1-zhaoyang.huang@unisoc.com/
> >
>
> Unfortunately, I am unable to cleanly apply this patch against the
> latest 6.6.31

Please try the one below, which works on my v6.6-based Android.
Thank you in advance for testing :D

 mm/huge_memory.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 064fbd90822b..5899906c326a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2498,7 +2498,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 {
 	struct folio *folio = page_folio(page);
 	struct page *head = &folio->page;
-	struct lruvec *lruvec;
+	struct lruvec *lruvec = folio_lruvec(folio);
 	struct address_space *swap_cache = NULL;
 	unsigned long offset = 0;
 	unsigned int nr = thp_nr_pages(head);
@@ -2513,9 +2513,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		xa_lock(&swap_cache->i_pages);
 	}
 
-	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
-	lruvec = folio_lruvec_lock(folio);
-
 	ClearPageHasHWPoisoned(head);
 
 	for (i = nr - 1; i >= 1; i--) {
@@ -2541,9 +2538,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	}
 
 	ClearPageCompound(head);
-	unlock_page_lruvec(lruvec);
-	/* Caller disabled irqs, so they are still disabled here */
-
 	split_page_owner(head, nr);
 
 	/* See comment in __split_huge_page_tail() */
@@ -2560,7 +2554,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		page_ref_add(head, 2);
 		xa_unlock(&head->mapping->i_pages);
 	}
-	local_irq_enable();
 
 	if (nr_dropped)
 		shmem_uncharge(head->mapping->host, nr_dropped);
@@ -2631,6 +2624,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	int extra_pins, ret;
 	pgoff_t end;
 	bool is_hzp;
+	struct lruvec *lruvec;
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
@@ -2714,6 +2708,14 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 
 	/* block interrupt reentry in xa_lock and spinlock */
 	local_irq_disable();
+
+	/*
+	 * take lruvec's lock before freeze the folio to prevent the folio
+	 * remains in the page cache with refcnt == 0, which could lead to
+	 * find_get_entry enters livelock by iterating the xarray.
+	 */
+	lruvec = folio_lruvec_lock(folio);
+
 	if (mapping) {
 		/*
 		 * Check if the folio is present in page cache.
@@ -2748,12 +2750,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		}
 
 		__split_huge_page(page, list, end);
+		unlock_page_lruvec(lruvec);
+		local_irq_enable();
 		ret = 0;
 	} else {
 		spin_unlock(&ds_queue->split_queue_lock);
 fail:
 		if (mapping)
 			xas_unlock(&xas);
+
+		unlock_page_lruvec(lruvec);
 		local_irq_enable();
 		remap_page(folio, folio_nr_pages(folio));
 		ret = -EAGAIN;
--
2.25.1