Date: Mon, 30 Oct 2023 13:25:13 +0100
From: Jan Kara
To: Vlastimil Babka
Cc: Mikulas Patocka, Marek Marczykowski-Górecki, Andrew Morton,
	Matthew Wilcox, Michal Hocko, stable@vger.kernel.org,
	regressions@lists.linux.dev, Alasdair Kergon, Mike Snitzer,
	dm-devel@lists.linux.dev, linux-mm@kvack.org, Jan Kara
Subject: Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
Message-ID: <20231030122513.6gds75hxd65gu747@quack3>
References: <89320668-67a2-2a41-e577-a2f561e3dfdd@suse.cz>
	<818a23f2-c242-1c51-232d-d479c3bcbb6@redhat.com>
	<18a38935-3031-1f35-bc36-40406e2e6fd2@suse.cz>
	<3514c87f-c87f-f91f-ca90-1616428f6317@redhat.com>
	<1a47fa28-3968-51df-5b0b-a19c675cc289@suse.cz>
In-Reply-To: <1a47fa28-3968-51df-5b0b-a19c675cc289@suse.cz>

On Mon 30-10-23 12:30:23, Vlastimil Babka wrote:
> On 10/30/23 12:22, Mikulas Patocka wrote:
> > On Mon, 30 Oct 2023, Vlastimil Babka wrote:
> >
> >> Ah, missed that. And the traces don't show that we would be waiting for
> >> that. I'm starting to think the allocation itself is really not the issue
> >> here. Also I don't think it deprives something else of large order pages, as
> >> per the sysrq listing they still existed.
> >>
> >> What I rather suspect is what happens next to the allocated bio such that it
> >> works well with order-0 or up to costly_order pages, but there's some
> >> problem causing a deadlock if the bio contains larger pages than that?
> >
> > Yes. There are many "if (order > PAGE_ALLOC_COSTLY_ORDER)" branches in the
> > memory allocation code and I suppose that one of them does something bad
> > and triggers this bug. But I don't know which one.
>
> It's not what I meant. All the interesting branches for costly order in page
> allocator/compaction only apply with __GFP_DIRECT_RECLAIM, so we can't be
> hitting those here.
> The traces I've seen suggest the allocation of the bio succeeded, and
> problems arose only after it was submitted.
>
> I wouldn't even be surprised if the threshold for hitting the bug was not
> exactly order > PAGE_ALLOC_COSTLY_ORDER but order > PAGE_ALLOC_COSTLY_ORDER
> + 1 or + 2 (has that been tested?) or rather that there's no exact
> threshold, but probability increases with order.

Well, it is possible that larger pages in a bio trip e.g. bio splitting
due to the maximum segment size the disk supports (which can be e.g. 0xffff),
and that this upsets something somewhere. But this is pure speculation. We
definitely need more debug data to be able to tell more.

								Honza
-- 
Jan Kara
SUSE Labs, CR
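
To put rough numbers on the segment-size speculation above, here is a small arithmetic sketch. It is not kernel code and does not model the real bio splitting logic (which also depends on other queue limits such as boundary masks). It assumes 4 KiB base pages, uses the kernel's PAGE_ALLOC_COSTLY_ORDER value of 3, and takes the 0xffff limit from the example in the text. The helper names page_bytes() and segments_needed() are made up for illustration.

```python
# Illustrative arithmetic only; this is not kernel code.
PAGE_SIZE = 4096                 # assumption: 4 KiB base pages (e.g. x86-64)
PAGE_ALLOC_COSTLY_ORDER = 3      # value defined in include/linux/mmzone.h
MAX_SEGMENT_SIZE = 0xFFFF        # example limit from the text (65535 bytes)


def page_bytes(order: int) -> int:
    """Size in bytes of one physically contiguous order-`order` page."""
    return PAGE_SIZE << order


def segments_needed(nbytes: int, max_seg: int = MAX_SEGMENT_SIZE) -> int:
    """Minimum number of segments of at most `max_seg` bytes covering `nbytes`."""
    return -(-nbytes // max_seg)  # ceiling division


if __name__ == "__main__":
    for order in range(0, 7):
        size = page_bytes(order)
        marker = " (> costly order)" if order > PAGE_ALLOC_COSTLY_ORDER else ""
        print(f"order {order}: {size:6d} bytes -> "
              f"{segments_needed(size)} segment(s){marker}")
```

Under these assumptions, order 4 (65536 bytes) is the first order at which a single contiguous page no longer fits into one 0xffff-byte segment, which happens to line up with order > PAGE_ALLOC_COSTLY_ORDER. Whether that has anything to do with the freeze is not established by this thread.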