Date: Thu, 21 Mar 2024 12:45:36 -0400
From: Kent Overstreet
To: Johannes Weiner
Cc: Chengming Zhou, Yosry Ahmed, Nhat Pham, linux-mm@kvack.org
Subject: Re: zswap doing io in GFP_NOIO reclaim context
References: <9fe46711-65ca-4818-b5d0-a16d2851d4b1@linux.dev> <20240321151757.GC777580@cmpxchg.org>
In-Reply-To: <20240321151757.GC777580@cmpxchg.org>

On Thu, Mar 21, 2024 at 11:17:57AM -0400, Johannes Weiner wrote:
> On Thu, Mar 21, 2024 at 01:16:23PM +0800, Chengming Zhou wrote:
> > On 2024/3/21 11:54, Kent Overstreet wrote:
> > > just got this bug report, things wildly backed up in bcachefs and did
> > > some digging and it looks like zswap is to blame
> > >
> > > [10264.128242] sysrq: Show Blocked State
> > > [10264.128268] task:kworker/20:0H state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
> > > [10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
> > > [10264.128295] Call Trace:
> > > [10264.128295]
> > > [10264.128297] __schedule+0x3e6/0x1520
> > > [10264.128301] ? ttwu_do_activate+0x64/0x200
> > > [10264.128303] schedule+0x32/0xd0
> > > [10264.128304] schedule_timeout+0x98/0x160
> > > [10264.128306] ? __pfx_process_timeout+0x10/0x10
> > > [10264.128308] io_schedule_timeout+0x50/0x80
> > > [10264.128309] wait_for_completion_io_timeout+0x7f/0x180
> > > [10264.128310] submit_bio_wait+0x78/0xb0
> > > [10264.128313] swap_writepage_bdev_sync+0xf6/0x150
> > > [10264.128315] ? __pfx_submit_bio_wait_endio+0x10/0x10
> > > [10264.128317] zswap_writeback_entry+0xf2/0x180
> > > [10264.128319] shrink_memcg_cb+0xe7/0x2f0
> > > [10264.128320] ? xa_load+0x8c/0xe0
> > > [10264.128321] ? __pfx_shrink_memcg_cb+0x10/0x10
> > > [10264.128322] __list_lru_walk_one+0xb9/0x1d0
> > > [10264.128324] ? __pfx_shrink_memcg_cb+0x10/0x10
> > > [10264.128325] list_lru_walk_one+0x5d/0x90
> > > [10264.128326] zswap_shrinker_scan+0xc4/0x130
> > > [10264.128327] do_shrink_slab+0x13f/0x360
> > > [10264.128328] shrink_slab+0x28e/0x3c0
> > > [10264.128329] shrink_one+0x123/0x1b0
> > > [10264.128331] shrink_node+0x97e/0xbc0
> > > [10264.128332] do_try_to_free_pages+0xe7/0x5b0
> > > [10264.128333] try_to_free_pages+0xe1/0x200
> > > [10264.128334] __alloc_pages_slowpath.constprop.0+0x343/0xde0
> > > [10264.128337] __alloc_pages+0x32d/0x350
> > > [10264.128338] allocate_slab+0x400/0x460
> > > [10264.128339] ___slab_alloc+0x40d/0xa40
> > > [10264.128341] ? mempool_alloc+0x86/0x1b0
> > > [10264.128343] ? finish_task_switch.isra.0+0x94/0x2f0
> > > [10264.128345] ? __schedule+0x3ee/0x1520
> > > [10264.128345] kmem_cache_alloc+0x2e7/0x330
> > > [10264.128347] ? mempool_alloc+0x86/0x1b0
> > > [10264.128348] mempool_alloc+0x86/0x1b0
> > > [10264.128349] bio_alloc_bioset+0x200/0x4f0
> > > [10264.128351] ? __queue_work.part.0+0x1a5/0x390
> > > [10264.128352] bio_alloc_clone+0x23/0x60
> > > [10264.128354] alloc_io+0x26/0xf0 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> > > [10264.128361] dm_submit_bio+0xb8/0x580 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> > > [10264.128366] __submit_bio+0xb0/0x170
> > > [10264.128367] submit_bio_noacct_nocheck+0x159/0x370
> > > [10264.128368] bch2_submit_wbio_replicas+0x21c/0x3a0 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> > > [10264.128391] btree_write_submit+0x1cf/0x220 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> > > [10264.128406] process_one_work+0x178/0x350
> > > [10264.128408] worker_thread+0x30f/0x450
> > > [10264.128409] ? __pfx_worker_thread+0x10/0x10
> > > [10264.128409] kthread+0xe5/0x120
> > >
> > > dm is using GFP_NOIO for that allocation, so zswap is clearly busted.
> >
> > You are right, and the shrink_control->gfp_mask is not even used in zswap,
> > which would just use GFP_KERNEL in its zswap_writeback_entry().
>
> I'm not sure the gfp_mask of the allocation is (fully?) applicable to
> the allocation of the swapcache.
>
> The reclaim-related ones are not. We're already in reclaim and won't
> recurse.
>
> Things like __GFP_THISNODE, __GFP_ACCOUNT are definitely not
> applicable to the swapcache allocation on writeback.
>
> See for reference also the gfp_mask in add_to_swap() ->
> add_to_swap_cache() when it's called from reclaim context.
>
> But the shrinker directly calls __swap_writepage(), which will submit
> IO, and may even enter the fs. We definitely have to filter for that:

Are you applying the fix? You're listed as maintainer.

>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index b31c977f53e9..535c907345e0 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1303,6 +1303,14 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
>  	if (!zswap_shrinker_enabled || !mem_cgroup_zswap_writeback_enabled(memcg))
>  		return 0;
>  
> +	/*
> +	 * The shrinker resumes swap writeback, which will enter block
> +	 * and may enter fs. XXX: Harmonize with vmscan.c __GFP_FS
> +	 * rules (may_enter_fs()), which apply on a per-folio basis.
> +	 */
> +	if (!gfp_has_io_fs(sc->gfp_mask))
> +		return 0;
> +
>  #ifdef CONFIG_MEMCG_KMEM
>  	mem_cgroup_flush_stats(memcg);
>  	nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
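
For anyone following along, the filter in the quoted patch can be modeled in userspace roughly as below. This is only a sketch under stated assumptions: the flag bit values are made up for illustration (the kernel's real values differ), and shrinker_count_model() is a hypothetical stand-in for the early return added to zswap_shrinker_count(). The body of gfp_has_io_fs() follows the shape of the kernel helper of that name, which requires both __GFP_IO and __GFP_FS to be set.

```c
#include <stdbool.h>

/* Hypothetical bit values, for illustration only; not the kernel's. */
typedef unsigned int gfp_t;
#define __GFP_IO      0x1u
#define __GFP_FS      0x2u
#define __GFP_RECLAIM 0x4u

/* Composite masks, combined the way the kernel composes them. */
#define GFP_NOIO   (__GFP_RECLAIM)
#define GFP_NOFS   (__GFP_RECLAIM | __GFP_IO)
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* True only if the caller allows both block IO and entering the fs. */
static inline bool gfp_has_io_fs(gfp_t gfp)
{
	return (gfp & (__GFP_IO | __GFP_FS)) == (__GFP_IO | __GFP_FS);
}

/*
 * Hypothetical model of the count callback after the patch: when the
 * allocation context forbids IO or fs entry, report nothing freeable,
 * so the scan callback (and with it swap writeback) is not reached.
 */
static unsigned long shrinker_count_model(gfp_t gfp_mask, unsigned long freeable)
{
	if (!gfp_has_io_fs(gfp_mask))
		return 0;
	return freeable;
}
```

With these assumed masks, a GFP_NOIO caller such as the dm allocation in the trace gets 0 from the count step, as does a GFP_NOFS caller; only a GFP_KERNEL caller sees the freeable count.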