Date: Wed, 21 May 2025 15:33:34 +0200
From: Oscar Salvador
To: Gavin Guo
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, muchun.song@linux.dev,
    akpm@linux-foundation.org, mike.kravetz@oracle.com, kernel-dev@igalia.com,
    stable@vger.kernel.org, Hugh Dickins, Florent Revest, Gavin Shan
Subject: Re: [PATCH v2] mm/hugetlb: fix a deadlock with pagecache_folio and
    hugetlb_fault_mutex_table
In-Reply-To: <20250521115727.2202284-1-gavinguo@igalia.com>

On Wed, May 21, 2025 at 07:57:27PM +0800, Gavin Guo wrote:
> The patch fixes a deadlock which can be triggered by an internal
> syzkaller [1] reproducer and captured by bpftrace script [2] and its log
> [3] in this scenario:
>
> Process 1                              Process 2
> ---                                    ---
> hugetlb_fault
>   mutex_lock(B) // take B
>   filemap_lock_hugetlb_folio
>     filemap_lock_folio
>       __filemap_get_folio
>         folio_lock(A) // take A
>   hugetlb_wp
>     mutex_unlock(B) // release B
>     ...                                hugetlb_fault
>     ...                                  mutex_lock(B) // take B
>                                          filemap_lock_hugetlb_folio
>                                            filemap_lock_folio
>                                              __filemap_get_folio
>                                                folio_lock(A) // blocked
>     unmap_ref_private
>     ...
>     mutex_lock(B) // retake and blocked
>
> This is an ABBA deadlock involving two locks:
> - Lock A: pagecache_folio lock
> - Lock B: hugetlb_fault_mutex_table lock
>
> The deadlock occurs between two processes as follows:
> 1. The first process (let’s call it Process 1) is handling a
> copy-on-write (COW) operation on a hugepage via hugetlb_wp. Due to
> insufficient reserved hugetlb pages, Process 1, owner of the reserved
> hugetlb page, attempts to unmap a hugepage owned by another process
> (non-owner) to satisfy the reservation.
> Before unmapping, Process 1 acquires lock B
> (hugetlb_fault_mutex_table lock) and then lock A
> (pagecache_folio lock). To proceed with the unmap, it releases Lock B
> but retains Lock A. After the unmap, Process 1 tries to reacquire Lock
> B. However, at this point, Lock B has already been acquired by another
> process.
>
> 2. The second process (Process 2) enters the hugetlb_fault handler
> during the unmap operation. It successfully acquires Lock B
> (hugetlb_fault_mutex_table lock) that was just released by Process 1,
> but then attempts to acquire Lock A (pagecache_folio lock), which is
> still held by Process 1.
>
> As a result, Process 1 (holding Lock A) is blocked waiting for Lock B
> (held by Process 2), while Process 2 (holding Lock B) is blocked waiting
> for Lock A (held by Process 1), constructing an ABBA deadlock scenario.
>
> The error message:
> INFO: task repro_20250402_:13229 blocked for more than 64 seconds.
>       Not tainted 6.15.0-rc3+ #24
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:repro_20250402_ state:D stack:25856 pid:13229 tgid:13228 ppid:3513 task_flags:0x400040 flags:0x00004006
> Call Trace:
>  __schedule+0x1755/0x4f50
>  schedule+0x158/0x330
>  schedule_preempt_disabled+0x15/0x30
>  __mutex_lock+0x75f/0xeb0
>  hugetlb_wp+0xf88/0x3440
>  hugetlb_fault+0x14c8/0x2c30
>  trace_clock_x86_tsc+0x20/0x20
>  do_user_addr_fault+0x61d/0x1490
>  exc_page_fault+0x64/0x100
>  asm_exc_page_fault+0x26/0x30
> RIP: 0010:__put_user_4+0xd/0x20
>  copy_process+0x1f4a/0x3d60
>  kernel_clone+0x210/0x8f0
>  __x64_sys_clone+0x18d/0x1f0
>  do_syscall_64+0x6a/0x120
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x41b26d
>
> INFO: task repro_20250402_:13229 is blocked on a mutex likely owned by task repro_20250402_:13250.
> task:repro_20250402_ state:D stack:28288 pid:13250 tgid:13228 ppid:3513 task_flags:0x400040 flags:0x00000006
> Call Trace:
>  __schedule+0x1755/0x4f50
>  schedule+0x158/0x330
>  io_schedule+0x92/0x110
>  folio_wait_bit_common+0x69a/0xba0
>  __filemap_get_folio+0x154/0xb70
>  hugetlb_fault+0xa50/0x2c30
>  trace_clock_x86_tsc+0x20/0x20
>  do_user_addr_fault+0xace/0x1490
>  exc_page_fault+0x64/0x100
>  asm_exc_page_fault+0x26/0x30
> RIP: 0033:0x402619
>
> INFO: task repro_20250402_:13250 blocked for more than 65 seconds.
>       Not tainted 6.15.0-rc3+ #24
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:repro_20250402_ state:D stack:28288 pid:13250 tgid:13228 ppid:3513 task_flags:0x400040 flags:0x00000006
> Call Trace:
>  __schedule+0x1755/0x4f50
>  schedule+0x158/0x330
>  io_schedule+0x92/0x110
>  folio_wait_bit_common+0x69a/0xba0
>  __filemap_get_folio+0x154/0xb70
>  hugetlb_fault+0xa50/0x2c30
>  trace_clock_x86_tsc+0x20/0x20
>  do_user_addr_fault+0xace/0x1490
>  exc_page_fault+0x64/0x100
>  asm_exc_page_fault+0x26/0x30
> RIP: 0033:0x402619
>
> Showing all locks held in the system:
> 1 lock held by khungtaskd/35:
>  #0: ffffffff879a7440 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x30/0x180
> 2 locks held by repro_20250402_/13229:
>  #0: ffff888017d801e0 (&mm->mmap_lock){++++}-{4:4}, at: lock_mm_and_find_vma+0x37/0x300
>  #1: ffff888000fec848 (&hugetlb_fault_mutex_table[i]){+.+.}-{4:4}, at: hugetlb_wp+0xf88/0x3440
> 3 locks held by repro_20250402_/13250:
>  #0: ffff8880177f3d08 (vm_lock){++++}-{0:0}, at: do_user_addr_fault+0x41b/0x1490
>  #1: ffff888000fec848 (&hugetlb_fault_mutex_table[i]){+.+.}-{4:4}, at: hugetlb_fault+0x3b8/0x2c30
>  #2: ffff8880129500e8 (&resv_map->rw_sema){++++}-{4:4}, at: hugetlb_fault+0x494/0x2c30
>
> Link: https://drive.google.com/file/d/1DVRnIW-vSayU5J1re9Ct_br3jJQU6Vpb/view?usp=drive_link [1]
> Link: https://github.com/bboymimi/bpftracer/blob/master/scripts/hugetlb_lock_debug.bt [2]
> Link: https://drive.google.com/file/d/1bWq2-8o-BJAuhoHWX7zAhI6ggfhVzQUI/view?usp=sharing [3]
> Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
> Cc: stable@vger.kernel.org
> Cc: Hugh Dickins
> Cc: Florent Revest
> Cc: Gavin Shan
> Suggested-by: Oscar Salvador
> Signed-off-by: Gavin Guo

Acked-by: Oscar Salvador

-- 
Oscar Salvador
SUSE Labs
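
For readers following the quoted scenario, the lock ordering can be replayed
in a small user-space model. This is only a sketch under stated assumptions:
it is not kernel code, the names model_lock, model_trylock, model_unlock and
replay_abba are hypothetical, and the "drop A before the unmap" variant is one
conceivable way to break the cycle, shown for contrast, not a description of
what the patch itself does. Locks are modelled as plain owner fields so the
interleaving is deterministic and nothing actually blocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the two locks; not kernel code. */
enum { NOBODY = 0, P1 = 1, P2 = 2 };

struct model_lock {
	int owner;	/* NOBODY, P1 or P2 */
};

/* Take the lock for `who` if it is free; report whether that worked. */
static bool model_trylock(struct model_lock *l, int who)
{
	if (l->owner != NOBODY)
		return false;
	l->owner = who;
	return true;
}

/* Release a lock; only the current owner may do so. */
static void model_unlock(struct model_lock *l, int who)
{
	assert(l->owner == who);
	l->owner = NOBODY;
}

/*
 * Replay the interleaving from the quoted diagram.
 * A models the pagecache_folio lock, B the hugetlb_fault_mutex_table lock.
 * Returns true when each process waits on a lock held by the other.
 */
bool replay_abba(bool p1_drops_a_before_unmap)
{
	struct model_lock A = { NOBODY };
	struct model_lock B = { NOBODY };

	/* Process 1: the fault path takes B, then A. */
	model_trylock(&B, P1);
	model_trylock(&A, P1);

	/* Process 1: B is released before the unmap. */
	model_unlock(&B, P1);
	/* Assumed variant for contrast: A is released as well. */
	if (p1_drops_a_before_unmap)
		model_unlock(&A, P1);

	/* Process 2: enters the fault path, takes B, then wants A. */
	bool p2_got_b = model_trylock(&B, P2);
	bool p2_got_a = model_trylock(&A, P2);

	/* Process 1: after the unmap it wants B back. */
	bool p1_got_b = model_trylock(&B, P1);

	bool p1_waits_on_p2 = !p1_got_b && B.owner == P2;
	bool p2_waits_on_p1 = p2_got_b && !p2_got_a && A.owner == P1;

	return p1_waits_on_p2 && p2_waits_on_p1;
}
```

Calling replay_abba(false) follows the quoted sequence and reports the
wait-for cycle; replay_abba(true) reports no cycle, because Process 2 can take
both locks and Process 1 only has to wait for B to be released.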