Date: Mon, 30 Oct 2023 14:30:52 +0100
From: Marek Marczykowski-Górecki
To: Jan Kara
Cc: Vlastimil Babka, Mikulas Patocka, Andrew Morton, Matthew Wilcox,
 Michal Hocko, stable@vger.kernel.org, regressions@lists.linux.dev,
 Alasdair Kergon, Mike Snitzer, dm-devel@lists.linux.dev, linux-mm@kvack.org
Subject: Re: Intermittent storage (dm-crypt?) freeze - regression 6.4->6.5
References: <89320668-67a2-2a41-e577-a2f561e3dfdd@suse.cz>
 <818a23f2-c242-1c51-232d-d479c3bcbb6@redhat.com>
 <18a38935-3031-1f35-bc36-40406e2e6fd2@suse.cz>
 <3514c87f-c87f-f91f-ca90-1616428f6317@redhat.com>
 <1a47fa28-3968-51df-5b0b-a19c675cc289@suse.cz>
 <20231030122513.6gds75hxd65gu747@quack3>
In-Reply-To: <20231030122513.6gds75hxd65gu747@quack3>

On Mon, Oct 30, 2023 at 01:25:13PM +0100, Jan Kara wrote:
> On Mon 30-10-23 12:30:23, Vlastimil Babka wrote:
> > On 10/30/23 12:22, Mikulas Patocka wrote:
> > > On Mon, 30 Oct 2023, Vlastimil Babka wrote:
> > >
> > >> Ah, missed that. And the traces don't show that we would be waiting for
> > >> that. I'm starting to think the allocation itself is really not the issue
> > >> here. Also I don't think it deprives something else of large order pages, as
> > >> per the sysrq listing they still existed.
> > >>
> > >> What I rather suspect is what happens next to the allocated bio such that it
> > >> works well with order-0 or up to costly_order pages, but there's some
> > >> problem causing a deadlock if the bio contains larger pages than that?
> > >
> > > Yes.
> > > There are many "if (order > PAGE_ALLOC_COSTLY_ORDER)" branches in the
> > > memory allocation code and I suppose that one of them does something bad
> > > and triggers this bug. But I don't know which one.
> >
> > It's not what I meant. All the interesting branches for costly order in page
> > allocator/compaction only apply with __GFP_DIRECT_RECLAIM, so we can't be
> > hitting those here.
> > The traces I've seen suggest the allocation of the bio succeeded, and
> > problems arose only after it was submitted.
> >
> > I wouldn't even be surprised if the threshold for hitting the bug was not
> > exactly order > PAGE_ALLOC_COSTLY_ORDER but order > PAGE_ALLOC_COSTLY_ORDER
> > + 1 or + 2 (has that been tested?) or rather that there's no exact
> > threshold, but probability increases with order.
>
> Well, it would be possible that larger pages in a bio would trip e.g. bio
> splitting due to maximum segment size the disk supports (which can be e.g.
> 0xffff) and that upsets something somewhere. But this is pure
> speculation. We definitely need more debug data to be able to tell more.

I can collect more info, but I need some guidance on how :) Maybe a patch
adding extra debug messages?
Note that I collect the logs via serial console (writing to disk doesn't
work when it freezes), and that limits the amount of data I can extract,
especially when it is printed quickly. For example, sysrq-t is too much.
Or maybe there is some trick to it, like increasing log_buf_len?
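As a rough illustration of the segment-size speculation quoted above (arithmetic only, not a diagnosis): assuming 4 KiB base pages and a maximum segment size of 0xffff bytes, an order-4 page (65536 bytes) is the first one that no longer fits in a single segment, and order 4 is also the first order above PAGE_ALLOC_COSTLY_ORDER (3). The helper names below are hypothetical, PAGE_SIZE and MAX_SEGMENT_SIZE are assumptions, and the real block layer may split bios differently.

```python
# Illustrative arithmetic only; PAGE_SIZE and MAX_SEGMENT_SIZE are assumptions.
PAGE_SIZE = 4096               # assumed 4 KiB base page (e.g. x86-64)
PAGE_ALLOC_COSTLY_ORDER = 3    # value defined in include/linux/mmzone.h
MAX_SEGMENT_SIZE = 0xFFFF      # example limit mentioned in the thread


def page_bytes(order: int) -> int:
    """Size in bytes of one physically contiguous page of the given order."""
    return PAGE_SIZE << order


def segments_needed(order: int, max_segment_size: int = MAX_SEGMENT_SIZE) -> int:
    """Minimum number of segments covering one page of this order (ceiling division)."""
    return -(-page_bytes(order) // max_segment_size)


if __name__ == "__main__":
    for order in range(0, 7):
        costly = order > PAGE_ALLOC_COSTLY_ORDER
        print(f"order {order}: {page_bytes(order):6d} bytes, "
              f"{segments_needed(order)} segment(s), costly={costly}")
```

If the limit on the affected disk is different, passing it as max_segment_size shows where the single-segment boundary would fall for that device.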
-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab