From: Chengming Zhou <chengming.zhou@linux.dev>
To: Kent Overstreet <kent.overstreet@linux.dev>,
Johannes Weiner <hannes@cmpxchg.org>,
Yosry Ahmed <yosryahmed@google.com>,
Nhat Pham <nphamcs@gmail.com>
Cc: linux-mm@kvack.org
Subject: Re: zswap doing io in GFP_NOIO reclaim context
Date: Thu, 21 Mar 2024 13:16:23 +0800
Message-ID: <9fe46711-65ca-4818-b5d0-a16d2851d4b1@linux.dev>
In-Reply-To: <rc4pk2r42oyvjo4dc62z6sovquyllq56i5cdgcaqbd7wy3hfzr@n4nbxido3fme>

On 2024/3/21 11:54, Kent Overstreet wrote:
> just got this bug report, things wildly backed up in bcachefs and do
> some digging and it looks like zswap is to blame
>
> [10264.128242] sysrq: Show Blocked State
> [10264.128268] task:kworker/20:0H state:D stack:0 pid:143 tgid:143 ppid:2 flags:0x00004000
> [10264.128271] Workqueue: bcachefs_io btree_write_submit [bcachefs]
> [10264.128295] Call Trace:
> [10264.128295] <TASK>
> [10264.128297] __schedule+0x3e6/0x1520
> [10264.128301] ? ttwu_do_activate+0x64/0x200
> [10264.128303] schedule+0x32/0xd0
> [10264.128304] schedule_timeout+0x98/0x160
> [10264.128306] ? __pfx_process_timeout+0x10/0x10
> [10264.128308] io_schedule_timeout+0x50/0x80
> [10264.128309] wait_for_completion_io_timeout+0x7f/0x180
> [10264.128310] submit_bio_wait+0x78/0xb0
> [10264.128313] swap_writepage_bdev_sync+0xf6/0x150
> [10264.128315] ? __pfx_submit_bio_wait_endio+0x10/0x10
> [10264.128317] zswap_writeback_entry+0xf2/0x180
> [10264.128319] shrink_memcg_cb+0xe7/0x2f0
> [10264.128320] ? xa_load+0x8c/0xe0
> [10264.128321] ? __pfx_shrink_memcg_cb+0x10/0x10
> [10264.128322] __list_lru_walk_one+0xb9/0x1d0
> [10264.128324] ? __pfx_shrink_memcg_cb+0x10/0x10
> [10264.128325] list_lru_walk_one+0x5d/0x90
> [10264.128326] zswap_shrinker_scan+0xc4/0x130
> [10264.128327] do_shrink_slab+0x13f/0x360
> [10264.128328] shrink_slab+0x28e/0x3c0
> [10264.128329] shrink_one+0x123/0x1b0
> [10264.128331] shrink_node+0x97e/0xbc0
> [10264.128332] do_try_to_free_pages+0xe7/0x5b0
> [10264.128333] try_to_free_pages+0xe1/0x200
> [10264.128334] __alloc_pages_slowpath.constprop.0+0x343/0xde0
> [10264.128337] __alloc_pages+0x32d/0x350
> [10264.128338] allocate_slab+0x400/0x460
> [10264.128339] ___slab_alloc+0x40d/0xa40
> [10264.128341] ? mempool_alloc+0x86/0x1b0
> [10264.128343] ? finish_task_switch.isra.0+0x94/0x2f0
> [10264.128345] ? __schedule+0x3ee/0x1520
> [10264.128345] kmem_cache_alloc+0x2e7/0x330
> [10264.128347] ? mempool_alloc+0x86/0x1b0
> [10264.128348] mempool_alloc+0x86/0x1b0
> [10264.128349] bio_alloc_bioset+0x200/0x4f0
> [10264.128351] ? __queue_work.part.0+0x1a5/0x390
> [10264.128352] bio_alloc_clone+0x23/0x60
> [10264.128354] alloc_io+0x26/0xf0 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> [10264.128361] dm_submit_bio+0xb8/0x580 [dm_mod 7e9e6b44df4927f93fb3e4b5c782767396f58382]
> [10264.128366] __submit_bio+0xb0/0x170
> [10264.128367] submit_bio_noacct_nocheck+0x159/0x370
> [10264.128368] bch2_submit_wbio_replicas+0x21c/0x3a0 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> [10264.128391] btree_write_submit+0x1cf/0x220 [bcachefs 85f1b9a7a824f272eff794653a06dde1a94439f2]
> [10264.128406] process_one_work+0x178/0x350
> [10264.128408] worker_thread+0x30f/0x450
> [10264.128409] ? __pfx_worker_thread+0x10/0x10
> [10264.128409] kthread+0xe5/0x120
>
> dm is using GFP_NOIO for that allocation, so zswap is clearly busted.
You are right. zswap does not even look at shrink_control->gfp_mask;
zswap_writeback_entry() just uses GFP_KERNEL regardless of the caller's
reclaim context.
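
To make the point concrete, here is a minimal userspace sketch (not
kernel code) of a shrinker scan callback that honours the caller's mask
before starting writeback. Everything prefixed with model_ or MODEL_ is
a hypothetical stand-in invented for this illustration, and the flag
values are made up; only the idea of testing the I/O bit in
shrink_control->gfp_mask comes from the discussion above.

```c
#include <stdio.h>

/* Illustrative flag bits only; these are NOT the kernel's real values. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_NOIO    (MODEL_GFP_RECLAIM)
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* Returned when the shrinker cannot make progress in this context. */
#define MODEL_SHRINK_STOP (~0UL)

/* Hypothetical stand-in for struct shrink_control. */
struct model_shrink_control {
	model_gfp_t gfp_mask;
	unsigned long nr_to_scan;
};

/* Counts how many entries the model "wrote back" (i.e. did I/O for). */
unsigned long model_writebacks;

/* Stand-in for the writeback step, which submits I/O and waits for it. */
static void model_writeback_entry(void)
{
	model_writebacks++;
}

/*
 * Scan callback that refuses to start writeback when the allocation
 * that triggered reclaim did not allow I/O.
 */
unsigned long model_zswap_shrinker_scan(const struct model_shrink_control *sc)
{
	unsigned long i;

	if (!(sc->gfp_mask & MODEL_GFP_IO))
		return MODEL_SHRINK_STOP;

	for (i = 0; i < sc->nr_to_scan; i++)
		model_writeback_entry();

	return sc->nr_to_scan;
}
```

With a check like this, reclaim entered from a GFP_NOIO allocation such
as the one dm makes in the trace would skip writeback instead of ending
up in submit_bio_wait().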
>
> We're already under generic_make_request(), so that submit_bio_wait()
> that zswap kicked off is never going to return.
>
> We need to think about how to add some assertions so that we know
> reclaim context is being honoured...
>
Maybe we could record the reclaim context in task_struct, so that
assertions can be added at the places that must honour it?
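
A rough userspace sketch of that idea, under the assumption that the
task records the gfp mask it entered reclaim with and that blocking I/O
paths check it. The struct, the helpers and the flag values below are
all hypothetical names for illustration, not existing kernel
interfaces, and nested reclaim is not handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; these are NOT the kernel's real values. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_RECLAIM 0x1u
#define MODEL_GFP_IO      0x2u
#define MODEL_GFP_FS      0x4u
#define MODEL_GFP_NOIO    (MODEL_GFP_RECLAIM)
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* Hypothetical stand-in for the per-task state kept in task_struct. */
struct model_task {
	bool in_reclaim;          /* task is currently inside direct reclaim */
	model_gfp_t reclaim_gfp;  /* mask of the allocation that triggered it */
};

/* Record the reclaim context when the task enters reclaim. */
void model_reclaim_enter(struct model_task *t, model_gfp_t gfp)
{
	t->in_reclaim = true;
	t->reclaim_gfp = gfp;
}

/* Clear the recorded context when reclaim is done. */
void model_reclaim_exit(struct model_task *t)
{
	t->in_reclaim = false;
	t->reclaim_gfp = 0;
}

/* I/O is allowed outside reclaim, or inside it if the mask permits I/O. */
bool model_may_do_io(const struct model_task *t)
{
	return !t->in_reclaim || (t->reclaim_gfp & MODEL_GFP_IO) != 0;
}

/* Stand-in for a blocking submit path that asserts the context. */
void model_submit_bio_wait(const struct model_task *t)
{
	assert(model_may_do_io(t));
	/* ... the real path would submit the bio and wait here ... */
}
```

In this model, the call chain from the trace would trip the assertion
in model_submit_bio_wait() right away, rather than hanging silently.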
Thanks.