From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <68ab727b-dc3d-327f-33b6-25bbfce8530e@huawei.com>
Date: Tue, 25 Mar 2025 11:02:30 +0800
Subject: Re: [PATCH] mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range
To: David Hildenbrand
References: <20250324131750.1551884-1-tujinjiang@huawei.com> <899807c3-931f-43e6-bf3e-188787a4205a@redhat.com>
From: Jinjiang Tu
In-Reply-To: <899807c3-931f-43e6-bf3e-188787a4205a@redhat.com>

On 2025/3/24 21:44, David Hildenbrand wrote:
> On 24.03.25 14:17, Jinjiang Tu wrote:
>> We triggered the below BUG:
>>
>>   page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x2 pfn:0x240402
>>   head: order:9 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
>>   flags: 0x1ffffe0000000040(head|node=1|zone=3|lastcpupid=0x1ffff)
>>   page_type: f4(hugetlb)
>>   page dumped because: VM_BUG_ON_PAGE(page->compound_head & 1)
>>   ------------[ cut here ]------------
>>   kernel BUG at ./include/linux/page-flags.h:310!
>>   Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>>   Modules linked in:
>>   CPU: 7 UID: 0 PID: 166 Comm: sh Not tainted 6.14.0-rc7-dirty #374
>>   Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>   pc : const_folio_flags+0x3c/0x58
>>   lr : const_folio_flags+0x3c/0x58
>>   Call trace:
>>    const_folio_flags+0x3c/0x58 (P)
>>    do_migrate_range+0x164/0x720
>>    offline_pages+0x63c/0x6fc
>>    memory_subsys_offline+0x190/0x1f4
>>    device_offline+0xc0/0x13c
>>    state_store+0x90/0xd8
>>    dev_attr_store+0x18/0x2c
>>    sysfs_kf_write+0x44/0x54
>>    kernfs_fop_write_iter+0x120/0x1cc
>>    vfs_write+0x240/0x378
>>    ksys_write+0x70/0x108
>>    __arm64_sys_write+0x1c/0x28
>>    invoke_syscall+0x48/0x10c
>>    el0_svc_common.constprop.0+0x40/0xe0
>>
>> When allocating a hugetlb folio, between the folio is taken from buddy
>> and prep_compound_page() is called, start_isolate_page_range() and
>> do_migrate_range() is called. When do_migrate_range() scans the head page
>> of the hugetlb folio, the compound_head field isn't set, so scans the
>> tail page next. And at this time, the compound_head field of tail page is
>> set, folio_test_large() is called by tail page, thus triggers VM_BUG_ON().
>>
>> To fix it, get folio refcount before calling folio_test_large().
>>
>> Fixes: 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp migration")
>> Signed-off-by: Jinjiang Tu
>> ---
>>   mm/memory_hotplug.c | 12 +++---------
>>   1 file changed, 3 insertions(+), 9 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 16cf9e17077e..f600c26ce5de 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1813,21 +1813,15 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>>          page = pfn_to_page(pfn);
>>          folio = page_folio(page);
>>
>> -        /*
>> -         * No reference or lock is held on the folio, so it might
>> -         * be modified concurrently (e.g. split).  As such,
>> -         * folio_nr_pages() may read garbage.  This is fine as the outer
>> -         * loop will revisit the split folio later.
>> -         */
>> -        if (folio_test_large(folio))
>> -            pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>> -
>>          if (!folio_try_get(folio))
>>              continue;
>>
>>          if (unlikely(page_folio(page) != folio))
>>              goto put_folio;
>>
>> +        if (folio_test_large(folio))
>> +            pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>
> Moving that will not make it able to skip the large frozen
> (refcount==0, e.g., free hugetlb) folio in the continue/put_folio case
> above.

Hmm, right. With this change, for a free hugetlb folio, pfn advances by
only 1 per loop iteration, so the loop takes longer to get past a free
hugetlb folio.

>
> We could similarly to dumping folios, snapshot them, so we can read
> stable data.

Do you mean extracting the snapshot code from __dump_page()? Taking a
snapshot for every folio may slow down do_migrate_range() as well, though.
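
To put a rough number on the slowdown for free hugetlb folios, here is a
minimal user-space sketch (not kernel code) that only models the pfn
arithmetic of the loop in do_migrate_range(). The function name
iterations_to_pass_free_folio() and the starting pfn are made up for
illustration, and the model assumes folio_try_get() always fails because
the folio is frozen (refcount == 0) and that the folio size can be read
reliably:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the pfn walk in do_migrate_range().
 * Returns how many loop iterations are needed to get past one free
 * (refcount == 0) large folio of nr_pages pages, starting at its head pfn.
 *
 * skip_before_get == true:  models the existing code, where the folio size
 *                           is read before folio_try_get(), so pfn jumps to
 *                           the last page of the folio even though the get
 *                           then fails.
 * skip_before_get == false: models the patch, where folio_try_get() fails
 *                           first and the loop continues, so pfn advances
 *                           by one page per iteration.
 */
static unsigned long iterations_to_pass_free_folio(unsigned long nr_pages,
						   bool skip_before_get)
{
	/* Assumed example value: the head of the folio in the report above. */
	unsigned long head_pfn = 0x240400UL;
	unsigned long end_pfn = head_pfn + nr_pages;
	unsigned long iterations = 0;
	unsigned long pfn;

	assert(nr_pages > 0);

	for (pfn = head_pfn; pfn < end_pfn; pfn++) {
		iterations++;

		/* Same arithmetic as folio_pfn() + folio_nr_pages() - 1. */
		if (skip_before_get)
			pfn = head_pfn + nr_pages - 1;

		/* folio_try_get() fails on the frozen folio: continue. */
	}

	return iterations;
}
```

For an order-9 folio (512 pages), this model needs 1 iteration with the
early skip and 512 iterations without it. Each iteration is cheap, so
whether the difference matters in practice would need measuring.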