Subject: Re: [PATCH] mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
From: Muchun Song <muchun.song@linux.dev>
Date: Thu, 22 May 2025 11:47:05 +0800
To: yangge1116@126.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, 21cnbao@gmail.com, david@redhat.com, baolin.wang@linux.alibaba.com, osalvador@suse.de, liuzixing@hygon.cn
Message-Id: <644FF836-9DC7-42B4-BACE-C433E637B885@linux.dev>
In-Reply-To: <1747884137-26685-1-git-send-email-yangge1116@126.com>
References: <1747884137-26685-1-git-send-email-yangge1116@126.com>

> On May 22, 2025, at 11:22, yangge1116@126.com wrote:
>
> From: Ge Yang
>
> A kernel crash was observed when replacing free hugetlb folios:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000028
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP NOPTI
> CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary)
> RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
> RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
> RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
> RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
> R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
> R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
> FS:  00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
> Call Trace:
>  replace_free_hugepage_folios+0xb6/0x100
>  alloc_contig_range_noprof+0x18a/0x590
>  ? srso_return_thunk+0x5/0x5f
>  ? down_read+0x12/0xa0
>  ? srso_return_thunk+0x5/0x5f
>  cma_range_alloc.constprop.0+0x131/0x290
>  __cma_alloc+0xcf/0x2c0
>  cma_alloc_write+0x43/0xb0
>  simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
>  debugfs_attr_write+0x46/0x70
>  full_proxy_write+0x62/0xa0
>  vfs_write+0xf8/0x420
>  ? srso_return_thunk+0x5/0x5f
>  ? filp_flush+0x86/0xa0
>  ? srso_return_thunk+0x5/0x5f
>  ? filp_close+0x1f/0x30
>  ? srso_return_thunk+0x5/0x5f
>  ? do_dup2+0xaf/0x160
>  ? srso_return_thunk+0x5/0x5f
>  ksys_write+0x65/0xe0
>  do_syscall_64+0x64/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> There is a potential race between __update_and_free_hugetlb_folio()
> and replace_free_hugepage_folios():
>
> CPU1                              CPU2
> __update_and_free_hugetlb_folio   replace_free_hugepage_folios
>                                     folio_test_hugetlb(folio)
>                                     -- It's still hugetlb folio.
>
>   __folio_clear_hugetlb(folio)
>   hugetlb_free_folio(folio)
>                                     h = folio_hstate(folio)
>                                     -- Here, h is NULL pointer
>
> When the above race condition occurs, folio_hstate(folio) returns
> NULL, and subsequent access to this NULL pointer will cause the
> system to crash. To resolve this issue, execute folio_hstate(folio)
> under the protection of the hugetlb_lock lock, ensuring that
> folio_hstate(folio) does not return NULL.
>
> Fixes: 04f13d241b8b ("mm: replace free hugepage folios after migration")
> Signed-off-by: Ge Yang
> Cc:

Thanks for fixing this problem. BTW, to catch similar problems in the
future, it would be better to add a WARN_ON to folio_hstate() that fires
if hugetlb_lock is not held when the folio's reference count is zero.

For this fix, LGTM.

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.
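
For illustration only, the race and the suggested check can be modeled in
userspace roughly as below. This is a single-threaded sketch, not kernel
code: struct folio, struct hstate, hugetlb_lock_held, warn_count,
folio_hstate_checked(), lookup_hstate_locked() and clear_hugetlb() are all
hypothetical stand-ins, the lock is a plain flag rather than a spinlock,
and the model assumes the hugetlb flag is cleared under hugetlb_lock. In
the kernel, something like lockdep_assert_held() or WARN_ON() would play
the role of the counter used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of this is the kernel's definition. */
struct hstate { unsigned int order; };

struct folio {
    int refcount;        /* 0 models a free hugetlb folio */
    bool hugetlb;        /* models the hugetlb folio flag */
    struct hstate *h;    /* only valid while hugetlb is set */
};

static bool hugetlb_lock_held;  /* toy lock: a flag, not a real spinlock */
static int warn_count;          /* counts how often the "WARN_ON" fired */

static void lock_hugetlb(void)
{
    assert(!hugetlb_lock_held);
    hugetlb_lock_held = true;
}

static void unlock_hugetlb(void)
{
    assert(hugetlb_lock_held);
    hugetlb_lock_held = false;
}

/* Models folio_hstate() with the suggested check added. */
static struct hstate *folio_hstate_checked(struct folio *f)
{
    /* Free folio looked up without the lock: record a warning. */
    if (f->refcount == 0 && !hugetlb_lock_held)
        warn_count++;
    return f->hugetlb ? f->h : NULL;
}

/* Models the fixed path: flag test and hstate lookup under one lock hold. */
static struct hstate *lookup_hstate_locked(struct folio *f)
{
    struct hstate *h = NULL;

    lock_hugetlb();
    if (f->hugetlb)
        h = folio_hstate_checked(f);
    unlock_hugetlb();
    return h;
}

/* Models CPU1 clearing the hugetlb flag before freeing the folio. */
static void clear_hugetlb(struct folio *f)
{
    lock_hugetlb();
    f->hugetlb = false;
    f->h = NULL;
    unlock_hugetlb();
}
```

Replaying the interleaving from the patch description by hand (read the
flag, call clear_hugetlb(), then call folio_hstate_checked() without the
lock) yields NULL and bumps warn_count, whereas lookup_hstate_locked()
either returns a valid hstate or skips the folio without warning.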