Date: Thu, 21 Mar 2024 11:17:57 -0400
From: Johannes Weiner
To: Chengming Zhou
Cc: Kent Overstreet, Yosry Ahmed, Nhat Pham, linux-mm@kvack.org
Subject: Re: zswap doing io in GFP_NOIO reclaim context
Message-ID: <20240321151757.GC777580@cmpxchg.org>
References: <9fe46711-65ca-4818-b5d0-a16d2851d4b1@linux.dev>
In-Reply-To: <9fe46711-65ca-4818-b5d0-a16d2851d4b1@linux.dev>

On Thu, Mar 21, 2024 at 01:16:23PM +0800, Chengming Zhou wrote:
> On 2024/3/21 11:54, Kent Overstreet wrote:
> > just got this bug report, things wildly backed up in bcachefs and do
> > some digging and it looks like zswap is to blame
> >
> > [10264.128242] sysrq: Show Blocked State
> > [10264.128268] task:kworker/20:0H state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
> > [10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
> > [10264.128295] Call Trace:
> > [10264.128295]
> > [10264.128297] __schedule+0x3e6/0x1520
> > [10264.128301] ? ttwu_do_activate+0x64/0x200
> > [10264.128303] schedule+0x32/0xd0
> > [10264.128304] schedule_timeout+0x98/0x160
> > [10264.128306] ? __pfx_process_timeout+0x10/0x10
> > [10264.128308] io_schedule_timeout+0x50/0x80
> > [10264.128309] wait_for_completion_io_timeout+0x7f/0x180
> > [10264.128310] submit_bio_wait+0x78/0xb0
> > [10264.128313] swap_writepage_bdev_sync+0xf6/0x150
> > [10264.128315] ? __pfx_submit_bio_wait_endio+0x10/0x10
> > [10264.128317] zswap_writeback_entry+0xf2/0x180
> > [10264.128319] shrink_memcg_cb+0xe7/0x2f0
> > [10264.128320] ? xa_load+0x8c/0xe0
> > [10264.128321] ? __pfx_shrink_memcg_cb+0x10/0x10
> > [10264.128322] __list_lru_walk_one+0xb9/0x1d0
> > [10264.128324] ? __pfx_shrink_memcg_cb+0x10/0x10
> > [10264.128325] list_lru_walk_one+0x5d/0x90
> > [10264.128326] zswap_shrinker_scan+0xc4/0x130
> > [10264.128327] do_shrink_slab+0x13f/0x360
> > [10264.128328] shrink_slab+0x28e/0x3c0
> > [10264.128329] shrink_one+0x123/0x1b0
> > [10264.128331] shrink_node+0x97e/0xbc0
> > [10264.128332] do_try_to_free_pages+0xe7/0x5b0
> > [10264.128333] try_to_free_pages+0xe1/0x200
> > [10264.128334] __alloc_pages_slowpath.constprop.0+0x343/0xde0
> > [10264.128337] __alloc_pages+0x32d/0x350
> > [10264.128338] allocate_slab+0x400/0x460
> > [10264.128339] ___slab_alloc+0x40d/0xa40
> > [10264.128341] ? mempool_alloc+0x86/0x1b0
> > [10264.128343] ? finish_task_switch.isra.0+0x94/0x2f0
> > [10264.128345] ? __schedule+0x3ee/0x1520
> > [10264.128345] kmem_cache_alloc+0x2e7/0x330
> > [10264.128347] ? mempool_alloc+0x86/0x1b0
> > [10264.128348] mempool_alloc+0x86/0x1b0
> > [10264.128349] bio_alloc_bioset+0x200/0x4f0
> > [10264.128351] ? __queue_work.part.0+0x1a5/0x390
> > [10264.128352] bio_alloc_clone+0x23/0x60
> > [10264.128354] alloc_io+0x26/0xf0 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> > [10264.128361] dm_submit_bio+0xb8/0x580 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> > [10264.128366] __submit_bio+0xb0/0x170
> > [10264.128367] submit_bio_noacct_nocheck+0x159/0x370
> > [10264.128368] bch2_submit_wbio_replicas+0x21c/0x3a0 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> > [10264.128391] btree_write_submit+0x1cf/0x220 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> > [10264.128406] process_one_work+0x178/0x350
> > [10264.128408] worker_thread+0x30f/0x450
> > [10264.128409] ? __pfx_worker_thread+0x10/0x10
> > [10264.128409] kthread+0xe5/0x120
> >
> > dm is using GFP_NOIO for that allocation, so zswap is clearly busted.
>
> You are right, and the shrink_control->gfp_mask is not even used in zswap,
> which would just use GFP_KERNEL in its zswap_writeback_entry().

I'm not sure the gfp_mask of the allocation is (fully?) applicable to
the allocation of the swapcache. The reclaim-related ones are not.
We're already in reclaim and won't recurse. Things like
__GFP_THISNODE, __GFP_ACCOUNT are definitely not applicable to the
swapcache allocation on writeback. See for reference also the
gfp_mask in add_to_swap() -> add_to_swap_cache() when it's called
from reclaim context.

But the shrinker directly calls __swap_writepage(), which will submit
IO, and may even enter the fs. We definitely have to filter for that:

diff --git a/mm/zswap.c b/mm/zswap.c
index b31c977f53e9..535c907345e0 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1303,6 +1303,14 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
 	if (!zswap_shrinker_enabled || !mem_cgroup_zswap_writeback_enabled(memcg))
 		return 0;
 
+	/*
+	 * The shrinker resumes swap writeback, which will enter block
+	 * and may enter fs. XXX: Harmonize with vmscan.c __GFP_FS
+	 * rules (may_enter_fs()), which apply on a per-folio basis.
+	 */
+	if (!gfp_has_io_fs(sc->gfp_mask))
+		return 0;
+
 #ifdef CONFIG_MEMCG_KMEM
 	mem_cgroup_flush_stats(memcg);
 	nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
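
For anyone following along outside mm, the effect of the added check can
be modelled in userspace. What follows is a minimal sketch, not kernel
code: the MODEL_* names, the gfp_model_t type, the bit values, and
model_shrinker_count() are invented for illustration. Only the predicate
is meant to mirror gfp_has_io_fs() as understood from
include/linux/gfp.h, which requires both __GFP_IO and __GFP_FS to be set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's gfp_t. */
typedef unsigned int gfp_model_t;

/* Placeholder bit values; the real flags are defined by the kernel. */
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u

/* Composite masks, modelled on how GFP_NOIO/GFP_NOFS/GFP_KERNEL nest. */
#define MODEL_GFP_NOIO   (MODEL_GFP_RECLAIM)
#define MODEL_GFP_NOFS   (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* True only if the caller may both start IO and enter the filesystem. */
static inline bool model_gfp_has_io_fs(gfp_model_t gfp)
{
	const gfp_model_t both = MODEL_GFP_IO | MODEL_GFP_FS;

	return (gfp & both) == both;
}

/*
 * Assumed shape of the count callback after the patch: report nothing
 * to shrink when the reclaim context forbids IO or fs entry, otherwise
 * report the number of entries that could be written back.
 */
static unsigned long model_shrinker_count(gfp_model_t gfp,
					  unsigned long nr_freeable)
{
	if (!model_gfp_has_io_fs(gfp))
		return 0;
	return nr_freeable;
}

/* Exercises the three contexts discussed in the thread. */
static void model_self_check(void)
{
	assert(model_shrinker_count(MODEL_GFP_NOIO, 128) == 0);
	assert(model_shrinker_count(MODEL_GFP_NOFS, 128) == 0);
	assert(model_shrinker_count(MODEL_GFP_KERNEL, 128) == 128);
}
```

In this model a GFP_NOIO-style mask, like the one dm uses in the trace
above, yields a count of zero, and so does a GFP_NOFS-style mask. A zero
count should make the shrinker core skip the scan callback entirely, so
zswap_writeback_entry() is never reached from such a context.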