Message-ID: <07b7d4fd-c600-4de1-aea4-037e148da79b@126.com>
Date: Mon, 26 May 2025 20:57:58 +0800
Subject: Re: [PATCH] mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
To: David Hildenbrand, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, 21cnbao@gmail.com, baolin.wang@linux.alibaba.com, muchun.song@linux.dev, osalvador@suse.de, liuzixing@hygon.cn
References: <1747884137-26685-1-git-send-email-yangge1116@126.com>
From: Ge Yang

On 2025/5/26 20:41, David Hildenbrand wrote:
> On 22.05.25 05:22, yangge1116@126.com wrote:
>> From: Ge Yang
>>
>> A kernel crash was observed when replacing free hugetlb folios:
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000028
>> PGD 0 P4D 0
>> Oops: Oops: 0000 [#1] SMP NOPTI
>> CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41
>> PREEMPT(voluntary)
>> RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
>> RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
>> RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
>> RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
>> RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
>> R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
>> R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
>> FS:  00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000)
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
>> Call Trace:
>>
>>   replace_free_hugepage_folios+0xb6/0x100
>>   alloc_contig_range_noprof+0x18a/0x590
>>   ? srso_return_thunk+0x5/0x5f
>>   ? down_read+0x12/0xa0
>>   ? srso_return_thunk+0x5/0x5f
>>   cma_range_alloc.constprop.0+0x131/0x290
>>   __cma_alloc+0xcf/0x2c0
>>   cma_alloc_write+0x43/0xb0
>>   simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
>>   debugfs_attr_write+0x46/0x70
>>   full_proxy_write+0x62/0xa0
>>   vfs_write+0xf8/0x420
>>   ? srso_return_thunk+0x5/0x5f
>>   ? filp_flush+0x86/0xa0
>>   ? srso_return_thunk+0x5/0x5f
>>   ? filp_close+0x1f/0x30
>>   ? srso_return_thunk+0x5/0x5f
>>   ? do_dup2+0xaf/0x160
>>   ? srso_return_thunk+0x5/0x5f
>>   ksys_write+0x65/0xe0
>>   do_syscall_64+0x64/0x170
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> There is a potential race between __update_and_free_hugetlb_folio()
>> and replace_free_hugepage_folios():
>>
>> CPU1                              CPU2
>> __update_and_free_hugetlb_folio   replace_free_hugepage_folios
>>                                      folio_test_hugetlb(folio)
>>                                      -- It's still hugetlb folio.
>>
>>    __folio_clear_hugetlb(folio)
>>    hugetlb_free_folio(folio)
>>                                      h = folio_hstate(folio)
>>                                      -- Here, h is NULL pointer
>>
>> When the above race condition occurs, folio_hstate(folio) returns
>> NULL, and subsequent access to this NULL pointer will cause the
>> system to crash. To resolve this issue, execute folio_hstate(folio)
>> under the protection of the hugetlb_lock lock, ensuring that
>> folio_hstate(folio) does not return NULL.
>>
>> Fixes: 04f13d241b8b ("mm: replace free hugepage folios after migration")
>> Signed-off-by: Ge Yang
>> Cc:
>> ---
>>   mm/hugetlb.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 3d3ca6b..6c2e007 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2924,12 +2924,20 @@ int replace_free_hugepage_folios(unsigned long
>> start_pfn, unsigned long end_pfn)
>>       while (start_pfn < end_pfn) {
>>           folio = pfn_folio(start_pfn);
>> +
>> +        /*
>> +         * The folio might have been dissolved from under our feet,
>> so make sure
>> +         * to carefully check the state under the lock.
>> +         */
>> +        spin_lock_irq(&hugetlb_lock);
>>           if (folio_test_hugetlb(folio)) {
>>               h = folio_hstate(folio);
>>           } else {
>> +            spin_unlock_irq(&hugetlb_lock);
>>               start_pfn++;
>>               continue;
>>           }
>> +        spin_unlock_irq(&hugetlb_lock);
>
> As mentioned elsewhere, this will grab the hugetlb_lock for each and
> every pfn in the range if there are no hugetlb folios (common case).
>
> That should certainly *not* be done.
>
> In case we see !folio_test_hugetlb(), we should just move on.
>

The main reason for taking hugetlb_lock here is to obtain a valid
hstate, because alloc_and_dissolve_hugetlb_folio() takes the hstate as
a parameter. I agree that taking the lock for every pfn in the range is
not performance-friendly. However, the patch at
https://lore.kernel.org/lkml/1747987559-23082-1-git-send-email-yangge1116@126.com/
removes all of these operations.
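
For illustration only, the pattern under discussion (an unlocked check
that skips the common case, followed by a re-check under the lock
before the state is read) can be sketched in userspace C. This is not
the kernel code and not the content of either patch: struct item,
item_lock, item_release() and get_state_if_tagged() are hypothetical
names, and a pthread mutex stands in for hugetlb_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a tag plus state valid only while tagged. */
struct item {
    atomic_bool tagged;   /* plays the role of folio_test_hugetlb() */
    const int *state;     /* plays the role of the hstate; NULL once released */
};

/* Hypothetical lock standing in for hugetlb_lock. */
static pthread_mutex_t item_lock = PTHREAD_MUTEX_INITIALIZER;

/* "Free" side: clear the tag and the state together, under the lock. */
static void item_release(struct item *it)
{
    assert(it != NULL);
    pthread_mutex_lock(&item_lock);
    atomic_store(&it->tagged, false);
    it->state = NULL;
    pthread_mutex_unlock(&item_lock);
}

/*
 * Return the state if the item is still tagged, otherwise NULL.
 * The unlocked check lets untagged items (the common case) skip the lock;
 * the locked re-check covers a release that raced with the first check.
 */
static const int *get_state_if_tagged(struct item *it)
{
    const int *state = NULL;

    assert(it != NULL);
    if (!atomic_load(&it->tagged))
        return NULL;                 /* move on without taking the lock */

    pthread_mutex_lock(&item_lock);
    if (atomic_load(&it->tagged))    /* re-check under the lock */
        state = it->state;
    pthread_mutex_unlock(&item_lock);
    return state;
}
```

A caller in this sketch treats a NULL return as "skip this entry", so
the lock is only contended for entries that looked tagged at the first
check.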