* Re: [PATCH] mm/bdi: fix race between cgwb_create and conflicting blkcg associations
2025-01-28 7:52 [PATCH] mm/bdi: fix race between cgwb_create and conflicting blkcg associations sooraj
@ 2025-01-28 0:53 ` Andrew Morton
2025-01-28 20:48 ` Tejun Heo
0 siblings, 1 reply; 3+ messages in thread
From: Andrew Morton @ 2025-01-28 0:53 UTC
To: sooraj; +Cc: linux-mm, Tejun Heo, linux-block
On Tue, 28 Jan 2025 02:52:50 -0500 sooraj <sooraj20636@gmail.com> wrote:
> Ensure cgwb (cgroup writeback) structures are uniquely associated with a
> memcg-blkcg pair to prevent inconsistencies when concurrent cgwb_create
> calls race. This resolves a scenario where two threads creating cgwbs
> for the same memory cgroup (memcg) but different I/O control groups (blkcg)
> could insert conflicting entries.
>
> The fix rechecks for existing cgwbs under the cgwb_lock spinlock after
> initial creation. If a conflicting cgwb (same memcg, different blkcg) is
> found, it is killed before inserting the new entry. This guarantees a
> 1:1 relationship between memcg-blkcg pairs and their cgwbs, preserving
> system invariants.
Thanks.

This looks sensible, but it would be best to bring it to Tejun's attention.

I assume that this race has been observed in the real world? If so,
please fully describe the circumstances under which it occurred, and
describe the userspace-visible effects.

Probably a "Cc: <stable@vger.kernel.org>" is appropriate. And it looks
like the offending code is so old that a Fixes: won't be needed.
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -723,24 +723,39 @@ static int cgwb_create(struct backing_dev_info *bdi,
>  	spin_lock_irqsave(&cgwb_lock, flags);
>  	if (test_bit(WB_registered, &bdi->wb.state) &&
>  	    blkcg_cgwb_list->next && memcg_cgwb_list->next) {
> -		/* we might have raced another instance of this function */
> -		ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
> -		if (!ret) {
> -			list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
> -			list_add(&wb->memcg_node, memcg_cgwb_list);
> -			list_add(&wb->blkcg_node, blkcg_cgwb_list);
> -			blkcg_pin_online(blkcg_css);
> -			css_get(memcg_css);
> -			css_get(blkcg_css);
> +		/* Re-check under lock to handle races */
> +		struct bdi_writeback *existing;
> +
> +		existing = radix_tree_lookup(&bdi->cgwb_tree, memcg_css->id);
> +		if (existing) {
> +			if (existing->blkcg_css != blkcg_css) {
> +				cgwb_kill(existing);
> +				existing = NULL;
> +			} else {
> +				ret = 0; /* Already exists, treat as success */
> +			}
> +		}
> +
> +		if (!existing) {
> +			ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
> +			if (!ret) {
> +				list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
> +				list_add(&wb->memcg_node, memcg_cgwb_list);
> +				list_add(&wb->blkcg_node, blkcg_cgwb_list);
> +				blkcg_pin_online(blkcg_css);
> +				css_get(memcg_css);
> +				css_get(blkcg_css);
> +			}
>  		}
>  	}
>  	spin_unlock_irqrestore(&cgwb_lock, flags);
> -	if (ret) {
> -		if (ret == -EEXIST)
> -			ret = 0;
> +
> +	if (!ret)
> +		goto out_put;
> +	if (ret == -EEXIST)
> +		ret = 0; /* Lost race, another thread created the same wb */
> +	else
>  		goto err_fprop_exit;
> -	}
> -	goto out_put;
>
>  err_fprop_exit:
>  	bdi_put(bdi);
* [PATCH] mm/bdi: fix race between cgwb_create and conflicting blkcg associations
@ 2025-01-28 7:52 sooraj
2025-01-28 0:53 ` Andrew Morton
0 siblings, 1 reply; 3+ messages in thread
From: sooraj @ 2025-01-28 7:52 UTC
To: linux-mm; +Cc: sooraj
Ensure cgwb (cgroup writeback) structures are uniquely associated with a
memcg-blkcg pair to prevent inconsistencies when concurrent cgwb_create
calls race. This resolves a scenario where two threads creating cgwbs
for the same memory cgroup (memcg) but different I/O control groups (blkcg)
could insert conflicting entries.

The fix rechecks for existing cgwbs under the cgwb_lock spinlock after
initial creation. If a conflicting cgwb (same memcg, different blkcg) is
found, it is killed before inserting the new entry. This guarantees a
1:1 relationship between memcg-blkcg pairs and their cgwbs, preserving
system invariants.

Signed-off-by: sooraj <sooraj20636@gmail.com>
---
mm/backing-dev.c | 43 +++++++++++++++++++++++++++++--------------
1 file changed, 29 insertions(+), 14 deletions(-)
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index e61bbb1bd622..67acb565e9a7 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -723,24 +723,39 @@ static int cgwb_create(struct backing_dev_info *bdi,
 	spin_lock_irqsave(&cgwb_lock, flags);
 	if (test_bit(WB_registered, &bdi->wb.state) &&
 	    blkcg_cgwb_list->next && memcg_cgwb_list->next) {
-		/* we might have raced another instance of this function */
-		ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
-		if (!ret) {
-			list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
-			list_add(&wb->memcg_node, memcg_cgwb_list);
-			list_add(&wb->blkcg_node, blkcg_cgwb_list);
-			blkcg_pin_online(blkcg_css);
-			css_get(memcg_css);
-			css_get(blkcg_css);
+		/* Re-check under lock to handle races */
+		struct bdi_writeback *existing;
+
+		existing = radix_tree_lookup(&bdi->cgwb_tree, memcg_css->id);
+		if (existing) {
+			if (existing->blkcg_css != blkcg_css) {
+				cgwb_kill(existing);
+				existing = NULL;
+			} else {
+				ret = 0; /* Already exists, treat as success */
+			}
+		}
+
+		if (!existing) {
+			ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
+			if (!ret) {
+				list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
+				list_add(&wb->memcg_node, memcg_cgwb_list);
+				list_add(&wb->blkcg_node, blkcg_cgwb_list);
+				blkcg_pin_online(blkcg_css);
+				css_get(memcg_css);
+				css_get(blkcg_css);
+			}
 		}
 	}
 	spin_unlock_irqrestore(&cgwb_lock, flags);
-	if (ret) {
-		if (ret == -EEXIST)
-			ret = 0;
+
+	if (!ret)
+		goto out_put;
+	if (ret == -EEXIST)
+		ret = 0; /* Lost race, another thread created the same wb */
+	else
 		goto err_fprop_exit;
-	}
-	goto out_put;
 
 err_fprop_exit:
 	bdi_put(bdi);
--
2.45.2
* Re: [PATCH] mm/bdi: fix race between cgwb_create and conflicting blkcg associations
2025-01-28 0:53 ` Andrew Morton
@ 2025-01-28 20:48 ` Tejun Heo
0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2025-01-28 20:48 UTC
To: Andrew Morton; +Cc: sooraj, linux-mm, linux-block
Hello,
On Mon, Jan 27, 2025 at 04:53:11PM -0800, Andrew Morton wrote:
> On Tue, 28 Jan 2025 02:52:50 -0500 sooraj <sooraj20636@gmail.com> wrote:
>
> > Ensure cgwb (cgroup writeback) structures are uniquely associated with a
> > memcg-blkcg pair to prevent inconsistencies when concurrent cgwb_create
> > calls race. This resolves a scenario where two threads creating cgwbs
> > for the same memory cgroup (memcg) but different I/O control groups (blkcg)
> > could insert conflicting entries.
> >
> > The fix rechecks for existing cgwbs under the cgwb_lock spinlock after
> > initial creation. If a conflicting cgwb (same memcg, different blkcg) is
> > found, it is killed before inserting the new entry. This guarantees a
> > 1:1 relationship between memcg-blkcg pairs and their cgwbs, preserving
> > system invariants.
I'm a bit confused. Radix tree doesn't allow two entries to be inserted on
the same key and the tree is keyed by memcg_id. Wouldn't that automatically
guarantee 1:1 relationship?
Thanks.
--
tejun
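
The objection above can be illustrated with a minimal user-space sketch. It assumes a toy fixed-size table in place of the kernel radix tree; the names toy_tree, toy_wb, toy_insert, toy_lookup and toy_race_demo are hypothetical and exist only for this illustration. The one property borrowed from the kernel is that radix_tree_insert() returns -EEXIST when the key is already occupied, so two creators racing on the same memcg id cannot both land an entry, whatever blkcg each one carries.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 64	/* hypothetical key space, just for this sketch */

/* Toy stand-in for a cgwb: only the two ids the argument needs. */
struct toy_wb {
	int memcg_id;
	int blkcg_id;
};

/* Toy stand-in for bdi->cgwb_tree: one slot per key. */
struct toy_tree {
	struct toy_wb *slot[TOY_SLOTS];
};

/*
 * Same contract as radix_tree_insert() for the case discussed here:
 * 0 on success, -EEXIST if the key already has an entry.
 */
int toy_insert(struct toy_tree *tree, int key, struct toy_wb *wb)
{
	if (key < 0 || key >= TOY_SLOTS)
		return -EINVAL;
	if (tree->slot[key])
		return -EEXIST;
	tree->slot[key] = wb;
	return 0;
}

struct toy_wb *toy_lookup(const struct toy_tree *tree, int key)
{
	if (key < 0 || key >= TOY_SLOTS)
		return NULL;
	return tree->slot[key];
}

/* Two creators target the same memcg id with different blkcg ids. */
int toy_race_demo(void)
{
	struct toy_tree tree = { { NULL } };
	struct toy_wb first = { .memcg_id = 7, .blkcg_id = 1 };
	struct toy_wb second = { .memcg_id = 7, .blkcg_id = 2 };
	int ret1 = toy_insert(&tree, first.memcg_id, &first);
	int ret2 = toy_insert(&tree, second.memcg_id, &second);

	assert(ret1 == 0);			/* first creator wins */
	assert(ret2 == -EEXIST);		/* second creator is rejected */
	assert(toy_lookup(&tree, 7) == &first);	/* only one entry per key */
	return ret2;
}
```

In this model the second insert is rejected regardless of its blkcg_id, which is the behaviour the unpatched code already relies on when it maps -EEXIST to 0. Whether the surviving entry's blkcg is the one the losing caller wanted is a separate question that this sketch does not address.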