Date: Mon, 27 Jan 2025 16:53:11 -0800
From: Andrew Morton
To: sooraj
Cc: linux-mm@kvack.org, Tejun Heo, linux-block@vger.kernel.org
Subject: Re: [PATCH] mm/bdi: fix race between cgwb_create and conflicting blkcg associations
Message-Id: <20250127165311.f51b98290b548aff1df92a81@linux-foundation.org>
In-Reply-To: <20250128075250.11500-1-sooraj20636@gmail.com>
References: <20250128075250.11500-1-sooraj20636@gmail.com>

On Tue, 28 Jan 2025 02:52:50 -0500 sooraj wrote:

> Ensure cgwb (cgroup writeback) structures are uniquely associated with a
> memcg-blkcg pair to prevent inconsistencies when concurrent cgwb_create
> calls race.
> This resolves a scenario where two threads creating cgwbs
> for the same memory cgroup (memcg) but different I/O control groups (blkcg)
> could insert conflicting entries.
>
> The fix rechecks for existing cgwbs under the cgwb_lock spinlock after
> initial creation. If a conflicting cgwb (same memcg, different blkcg) is
> found, it is killed before inserting the new entry. This guarantees a
> 1:1 relationship between memcg-blkcg pairs and their cgwbs, preserving
> system invariants.

Thanks.  This looks sensible, but it would be best to bring it to
Tejun's attention.

I assume that this race has been observed in the real world?  If so,
please fully describe the circumstances under which it occurred, and
describe the userspace-visible effects.

Probably a "Cc: " is appropriate.  And it looks like the offending code
is so old that a Fixes: won't be needed.

> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -723,24 +723,39 @@ static int cgwb_create(struct backing_dev_info *bdi,
>  	spin_lock_irqsave(&cgwb_lock, flags);
>  	if (test_bit(WB_registered, &bdi->wb.state) &&
>  	    blkcg_cgwb_list->next && memcg_cgwb_list->next) {
> -		/* we might have raced another instance of this function */
> -		ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
> -		if (!ret) {
> -			list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
> -			list_add(&wb->memcg_node, memcg_cgwb_list);
> -			list_add(&wb->blkcg_node, blkcg_cgwb_list);
> -			blkcg_pin_online(blkcg_css);
> -			css_get(memcg_css);
> -			css_get(blkcg_css);
> +		/* Re-check under lock to handle races */
> +		struct bdi_writeback *existing;
> +
> +		existing = radix_tree_lookup(&bdi->cgwb_tree, memcg_css->id);
> +		if (existing) {
> +			if (existing->blkcg_css != blkcg_css) {
> +				cgwb_kill(existing);
> +				existing = NULL;
> +			} else {
> +				ret = 0; /* Already exists, treat as success */
> +			}
> +		}
> +
> +		if (!existing) {
> +			ret = radix_tree_insert(&bdi->cgwb_tree, memcg_css->id, wb);
> +			if (!ret) {
> +				list_add_tail_rcu(&wb->bdi_node, &bdi->wb_list);
> +				list_add(&wb->memcg_node, memcg_cgwb_list);
> +				list_add(&wb->blkcg_node, blkcg_cgwb_list);
> +				blkcg_pin_online(blkcg_css);
> +				css_get(memcg_css);
> +				css_get(blkcg_css);
> +			}
>  		}
>  	}
>  	spin_unlock_irqrestore(&cgwb_lock, flags);
> -	if (ret) {
> -		if (ret == -EEXIST)
> -			ret = 0;
> +
> +	if (!ret)
> +		goto out_put;
> +	if (ret == -EEXIST)
> +		ret = 0; /* Lost race, another thread created the same wb */
> +	else
>  		goto err_fprop_exit;
> -	}
> -	goto out_put;
>
>  err_fprop_exit:
>  	bdi_put(bdi);
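
For anyone following the locking logic in the quoted hunk, here is a rough
userspace sketch of the "recheck under the lock, then insert" pattern the
changelog describes. It is only an analogy, not kernel code: the fixed-size
array stands in for bdi->cgwb_tree, the pthread mutex for cgwb_lock, and the
names wb_entry, wb_table, wb_lookup and wb_create are all made up for
illustration. It also frees replaced entries immediately, which the kernel
cannot do (there the object lifetime is managed by refcounts and RCU).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_MEMCG 16	/* hypothetical table size standing in for the radix tree */

/* Hypothetical stand-in for struct bdi_writeback: one entry per memcg id. */
struct wb_entry {
	int memcg_id;
	int blkcg_id;
};

static struct wb_entry *wb_table[MAX_MEMCG];
/* Plays the role of cgwb_lock in this sketch. */
static pthread_mutex_t wb_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the entry installed for memcg_id, or NULL if there is none. */
struct wb_entry *wb_lookup(int memcg_id)
{
	struct wb_entry *e;

	assert(memcg_id >= 0 && memcg_id < MAX_MEMCG);
	pthread_mutex_lock(&wb_lock);
	e = wb_table[memcg_id];
	pthread_mutex_unlock(&wb_lock);
	return e;
}

/*
 * Install an entry for the (memcg_id, blkcg_id) pair.
 * Returns 0 on success, including the case where an identical entry is
 * already installed, and -1 if allocation fails.
 */
int wb_create(int memcg_id, int blkcg_id)
{
	struct wb_entry *new, *existing, *stale;

	assert(memcg_id >= 0 && memcg_id < MAX_MEMCG);

	/* Build the new entry before taking the lock. */
	new = malloc(sizeof(*new));
	if (!new)
		return -1;
	new->memcg_id = memcg_id;
	new->blkcg_id = blkcg_id;

	pthread_mutex_lock(&wb_lock);
	/* Recheck under the lock: another caller may have raced us. */
	existing = wb_table[memcg_id];
	if (existing && existing->blkcg_id == blkcg_id) {
		/* Same pair already present: keep it, discard our copy. */
		stale = new;
	} else {
		/* Empty slot or conflicting blkcg: the new entry replaces it. */
		stale = existing;
		wb_table[memcg_id] = new;
	}
	pthread_mutex_unlock(&wb_lock);

	free(stale);	/* free(NULL) is a no-op */
	return 0;
}
```

Note that in the sketch the caller that finds an identical entry throws away
its own freshly built copy before returning success; whatever a real
implementation does on that path, the unused object has to be released
somewhere.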