From: Andrew Morton
To: Gavin Guo
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, muchun.song@linux.dev, osalvador@suse.de, kernel-dev@igalia.com, stable@vger.kernel.org, Hugh Dickins, Florent Revest, Gavin Shan, Byungchul Park
Subject: Re: [PATCH] mm/hugetlb: fix a deadlock with pagecache_folio and hugetlb_fault_mutex_table
Date: Tue, 13 May 2025 17:56:33 -0700
Message-Id: <20250513175633.85f4e19f4232a68ab04c8e41@linux-foundation.org>
In-Reply-To: <20250513093448.592150-1-gavinguo@igalia.com>
References: <20250513093448.592150-1-gavinguo@igalia.com>

On Tue, 13 May 2025 17:34:48 +0800 Gavin Guo wrote:

> The patch fixes a deadlock which can be triggered by an internal
> syzkaller [1] reproducer and captured by bpftrace script [2] and its log
> [3] in this scenario:
>
> Process 1                              Process 2
> ---                                    ---
> hugetlb_fault
>   mutex_lock(B) // take B
>   filemap_lock_hugetlb_folio
>     filemap_lock_folio
>       __filemap_get_folio
>         folio_lock(A) // take A
>   hugetlb_wp
>     mutex_unlock(B) // release B
>     ...                                hugetlb_fault
>     ...                                  mutex_lock(B) // take B
>                                          filemap_lock_hugetlb_folio
>                                            filemap_lock_folio
>                                              __filemap_get_folio
>                                                folio_lock(A) // blocked
>     unmap_ref_private
>     ...
>     mutex_lock(B) // retake and blocked
>
> This is a ABBA deadlock involving two locks:
> - Lock A: pagecache_folio lock
> - Lock B: hugetlb_fault_mutex_table lock

Nostalgia. A decade or three ago many of us spent much of our lives
staring at ABBA deadlocks. Then came lockdep and after a few more
years, it all stopped.

I've long hoped that lockdep would gain a solution to custom locks such
as folio_wait_bit_common(), but not yet.

Byungchul, please take a look. Would DEPT
(https://lkml.kernel.org/r/20250513100730.12664-1-byungchul@sk.com)
have warned us about this?

>
> ...
>
> The deadlock occurs between two processes as follows:
>
> ...
>
> Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
> Cc:

It's been there for three years so I assume we aren't in a hurry.

The fix looks a bit nasty, sorry. Perhaps designed for a minimal patch
footprint? That's good for a backportable fixup, but a more broadly
architected solution may be needed going forward.

I'll queue it for 6.16-rc1 with a cc:stable, so this should be
presented to the -stable trees 3-4 weeks from now.
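
To make the lock-ordering argument concrete, here is a minimal userspace sketch of the kind of acquisition-order tracking a lockdep-style checker does. It is a toy model, not the kernel's lockdep implementation, and it assumes the folio lock is annotated as an ordinary lock class, which is exactly what the kernel does not do today for folio_wait_bit_common(). Every name in it (lock_graph, task_locks, track_acquire, track_release, replay_process1, LOCK_A_FOLIO, LOCK_B_FAULT_MUTEX) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8   /* hypothetical limits for this sketch */
#define MAX_HELD    8

/* Hypothetical lock classes standing in for locks A and B in the trace. */
enum { LOCK_A_FOLIO = 0, LOCK_B_FAULT_MUTEX = 1 };

/* order[a][b] is true once some task acquired b while holding a. */
struct lock_graph {
    bool order[MAX_CLASSES][MAX_CLASSES];
};

/* Per-task list of currently held lock classes. */
struct task_locks {
    int held[MAX_HELD];
    int depth;
};

/* Depth-first search: is `to` reachable from `from` via recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int n = 0; n < MAX_CLASSES; n++)
        if (g->order[from][n] && !seen[n] && reachable(g, n, to, seen))
            return true;
    return false;
}

/*
 * Record that task `t` acquires class `cls`. Returns false if the new
 * held -> cls dependency contradicts an order seen earlier (a cycle).
 */
static bool track_acquire(struct lock_graph *g, struct task_locks *t, int cls)
{
    bool ok = true;

    assert(cls >= 0 && cls < MAX_CLASSES && t->depth < MAX_HELD);
    for (int i = 0; i < t->depth; i++) {
        bool seen[MAX_CLASSES] = { false };
        int h = t->held[i];

        /* Adding h -> cls closes a cycle if cls already leads to h. */
        if (reachable(g, cls, h, seen))
            ok = false;
        g->order[h][cls] = true;
    }
    t->held[t->depth++] = cls;
    return ok;
}

/* Drop `cls` from the task's held list (out-of-order release allowed). */
static void track_release(struct task_locks *t, int cls)
{
    for (int i = 0; i < t->depth; i++) {
        if (t->held[i] == cls) {
            memmove(&t->held[i], &t->held[i + 1],
                    (size_t)(t->depth - i - 1) * sizeof(t->held[0]));
            t->depth--;
            return;
        }
    }
}

/* Replay Process 1 from the trace; returns true if an inversion is flagged. */
static bool replay_process1(void)
{
    struct lock_graph g = { 0 };
    struct task_locks p1 = { 0 };

    track_acquire(&g, &p1, LOCK_B_FAULT_MUTEX);   /* mutex_lock(B)   */
    track_acquire(&g, &p1, LOCK_A_FOLIO);         /* folio_lock(A)   */
    track_release(&p1, LOCK_B_FAULT_MUTEX);       /* mutex_unlock(B) */
    /* Retaking B while still holding A records A -> B after B -> A. */
    return !track_acquire(&g, &p1, LOCK_B_FAULT_MUTEX);
}
```

In this model, Process 1's path alone is enough to flag the problem: it records B -> A on the way in and A -> B when it retakes the mutex, so no second process and no actual hang is needed to see the inversion. Whether DEPT or a future lockdep would report the real case depends on how the folio lock wait is annotated, which this sketch simply assumes.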