Message-ID: <9fe46711-65ca-4818-b5d0-a16d2851d4b1@linux.dev>
Date: Thu, 21 Mar 2024 13:16:23 +0800
Subject: Re: zswap doing io in GFP_NOIO reclaim context
To: Kent Overstreet, Johannes Weiner, Yosry Ahmed, Nhat Pham
Cc: linux-mm@kvack.org
From: Chengming Zhou

On 2024/3/21 11:54, Kent Overstreet wrote:
> just got this bug report, things wildly backed up in bcachefs and do
> some digging and it looks like zswap is to blame
>
> [10264.128242] sysrq: Show Blocked State
> [10264.128268] task:kworker/20:0H state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
> [10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
> [10264.128295] Call Trace:
> [10264.128295]
> [10264.128297] __schedule+0x3e6/0x1520
> [10264.128301] ? ttwu_do_activate+0x64/0x200
> [10264.128303] schedule+0x32/0xd0
> [10264.128304] schedule_timeout+0x98/0x160
> [10264.128306] ? __pfx_process_timeout+0x10/0x10
> [10264.128308] io_schedule_timeout+0x50/0x80
> [10264.128309] wait_for_completion_io_timeout+0x7f/0x180
> [10264.128310] submit_bio_wait+0x78/0xb0
> [10264.128313] swap_writepage_bdev_sync+0xf6/0x150
> [10264.128315] ? __pfx_submit_bio_wait_endio+0x10/0x10
> [10264.128317] zswap_writeback_entry+0xf2/0x180
> [10264.128319] shrink_memcg_cb+0xe7/0x2f0
> [10264.128320] ? xa_load+0x8c/0xe0
> [10264.128321] ? __pfx_shrink_memcg_cb+0x10/0x10
> [10264.128322] __list_lru_walk_one+0xb9/0x1d0
> [10264.128324] ? __pfx_shrink_memcg_cb+0x10/0x10
> [10264.128325] list_lru_walk_one+0x5d/0x90
> [10264.128326] zswap_shrinker_scan+0xc4/0x130
> [10264.128327] do_shrink_slab+0x13f/0x360
> [10264.128328] shrink_slab+0x28e/0x3c0
> [10264.128329] shrink_one+0x123/0x1b0
> [10264.128331] shrink_node+0x97e/0xbc0
> [10264.128332] do_try_to_free_pages+0xe7/0x5b0
> [10264.128333] try_to_free_pages+0xe1/0x200
> [10264.128334] __alloc_pages_slowpath.constprop.0+0x343/0xde0
> [10264.128337] __alloc_pages+0x32d/0x350
> [10264.128338] allocate_slab+0x400/0x460
> [10264.128339] ___slab_alloc+0x40d/0xa40
> [10264.128341] ? mempool_alloc+0x86/0x1b0
> [10264.128343] ? finish_task_switch.isra.0+0x94/0x2f0
> [10264.128345] ? __schedule+0x3ee/0x1520
> [10264.128345] kmem_cache_alloc+0x2e7/0x330
> [10264.128347] ? mempool_alloc+0x86/0x1b0
> [10264.128348] mempool_alloc+0x86/0x1b0
> [10264.128349] bio_alloc_bioset+0x200/0x4f0
> [10264.128351] ? __queue_work.part.0+0x1a5/0x390
> [10264.128352] bio_alloc_clone+0x23/0x60
> [10264.128354] alloc_io+0x26/0xf0 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> [10264.128361] dm_submit_bio+0xb8/0x580 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> [10264.128366] __submit_bio+0xb0/0x170
> [10264.128367] submit_bio_noacct_nocheck+0x159/0x370
> [10264.128368] bch2_submit_wbio_replicas+0x21c/0x3a0 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> [10264.128391] btree_write_submit+0x1cf/0x220 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> [10264.128406] process_one_work+0x178/0x350
> [10264.128408] worker_thread+0x30f/0x450
> [10264.128409] ? __pfx_worker_thread+0x10/0x10
> [10264.128409] kthread+0xe5/0x120
>
> dm is using GFP_NOIO for that allocation, so zswap is clearly busted.

You are right. zswap does not even look at shrink_control->gfp_mask;
zswap_writeback_entry() just uses GFP_KERNEL.

>
> We're already under generic_make_request(), so that submit_bio_wait()
> that zswap kicked off is never going to return.
>
> We need to think about how to add some assertions so that we know
> reclaim context is being honoured...
>

Maybe we could put the reclaim context info in the task_struct? Then
assertions could be added in the relevant places.

Thanks.
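
For illustration only, here is a minimal userspace sketch of the two ideas above: a shrinker scan that checks the gfp_mask it was handed before doing writeback, and a per-task reclaim context that an assertion can consult before blocking on IO. This is not kernel code and not a proposed patch. Every name prefixed with model_ or MODEL_, the flag values, and the struct layouts are assumptions made up for this sketch.

```c
/*
 * Userspace model only. The flag values, struct layouts and helper names
 * are stand-ins invented for this sketch, not the kernel's definitions.
 */
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

#define MODEL_GFP_IO     0x1u                 /* caller may start IO */
#define MODEL_GFP_FS     0x2u                 /* caller may enter the FS */
#define MODEL_GFP_NOIO   0x0u                 /* neither IO nor FS allowed */
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical stand-in for struct shrink_control. */
struct model_shrink_control {
	gfp_t gfp_mask;
	unsigned long nr_to_scan;
};

/* Hypothetical stand-in for reclaim context kept in the task_struct. */
struct model_task {
	bool in_reclaim;
	gfp_t reclaim_gfp;
};

static struct model_task current_task;
static unsigned long writebacks_issued;

/* True if the current context is allowed to block on IO. */
static bool model_reclaim_may_io(void)
{
	return !current_task.in_reclaim ||
	       (current_task.reclaim_gfp & MODEL_GFP_IO);
}

/* Models a submit-and-wait path; the assertion catches NOIO violations. */
static void model_submit_io_and_wait(void)
{
	assert(model_reclaim_may_io());
	writebacks_issued++;
}

/* Shrinker scan that honours sc->gfp_mask instead of assuming GFP_KERNEL. */
static unsigned long model_shrinker_scan(const struct model_shrink_control *sc)
{
	unsigned long freed = 0;

	if (!(sc->gfp_mask & MODEL_GFP_IO))
		return 0;	/* writeback needs IO, so skip it here */

	for (unsigned long i = 0; i < sc->nr_to_scan; i++) {
		model_submit_io_and_wait();
		freed++;
	}
	return freed;
}

/* Models direct reclaim: record the context, run the shrinker, clear it. */
static unsigned long model_reclaim(gfp_t gfp_mask, unsigned long nr)
{
	struct model_shrink_control sc = {
		.gfp_mask = gfp_mask,
		.nr_to_scan = nr,
	};
	unsigned long freed;

	current_task.in_reclaim = true;
	current_task.reclaim_gfp = gfp_mask;
	freed = model_shrinker_scan(&sc);
	current_task.in_reclaim = false;
	return freed;
}
```

In this model, model_reclaim(MODEL_GFP_NOIO, 4) frees nothing and issues no writeback, while model_reclaim(MODEL_GFP_KERNEL, 4) writes back four entries. If the gfp_mask check in model_shrinker_scan() is removed, the NOIO case trips the assertion in model_submit_io_and_wait() instead of blocking silently, which is the kind of early failure the assertions are meant to give.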