Message-ID: <1a47fa28-3968-51df-5b0b-a19c675cc289@suse.cz>
Date: Mon, 30 Oct 2023 12:30:23 +0100
Subject: Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
From: Vlastimil Babka
To: Mikulas Patocka
Cc: Marek Marczykowski-Górecki, Andrew Morton, Matthew Wilcox,
 Michal Hocko, stable@vger.kernel.org, regressions@lists.linux.dev,
 Alasdair Kergon, Mike Snitzer, dm-devel@lists.linux.dev,
 linux-mm@kvack.org, Jan Kara
References: <89320668-67a2-2a41-e577-a2f561e3dfdd@suse.cz>
 <818a23f2-c242-1c51-232d-d479c3bcbb6@redhat.com>
 <18a38935-3031-1f35-bc36-40406e2e6fd2@suse.cz>
 <3514c87f-c87f-f91f-ca90-1616428f6317@redhat.com>
In-Reply-To: <3514c87f-c87f-f91f-ca90-1616428f6317@redhat.com>

On 10/30/23 12:22, Mikulas Patocka wrote:
>
>
> On Mon, 30 Oct 2023, Vlastimil Babka wrote:
>
>> Ah, missed that. And the traces don't show that we would be waiting for
>> that. I'm starting to think the allocation itself is really not the issue
>> here. Also I don't think it deprives something else of large order pages, as
>> per the sysrq listing they still existed.
>>
>> What I rather suspect is what happens next to the allocated bio such that it
>> works well with order-0 or up to costly_order pages, but there's some
>> problem causing a deadlock if the bio contains larger pages than that?
>
> Yes. There are many "if (order > PAGE_ALLOC_COSTLY_ORDER)" branches in the
> memory allocation code and I suppose that one of them does something bad
> and triggers this bug. But I don't know which one.

That's not what I meant. All the interesting branches for costly order in
the page allocator/compaction only apply with __GFP_DIRECT_RECLAIM, so we
can't be hitting those here. The traces I've seen suggest the allocation of
the bio succeeded, and problems arose only after it was submitted.

I wouldn't even be surprised if the threshold for hitting the bug was not
exactly order > PAGE_ALLOC_COSTLY_ORDER but order > PAGE_ALLOC_COSTLY_ORDER
+ 1 or + 2 (has that been tested?), or rather that there's no exact
threshold and the probability simply increases with order.

>> Cc Honza. The thread starts here:
>> https://lore.kernel.org/all/ZTNH0qtmint%2FzLJZ@mail-itl/
>>
>> The linked qubes reports has a number of blocked task listings that can be
>> expanded:
>> https://github.com/QubesOS/qubes-issues/issues/8575
>
> Mikulas
>