Date: Thu, 2 Feb 2023 12:15:52 +0800
From: Ming Lei
To: Waiman Long
Cc: Jens Axboe, Tejun Heo, Josef Bacik, Zefan Li, Johannes Weiner,
 Andrew Morton, cgroups@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Michal Koutný,
 "Dennis Zhou (Facebook)", ming.lei@redhat.com
Subject: Re: [PATCH v4 2/2] blk-cgroup: Flush stats at blkgs destruction path
References: <20221215033132.230023-1-longman@redhat.com> <20221215033132.230023-3-longman@redhat.com>
In-Reply-To: <20221215033132.230023-3-longman@redhat.com>

On Wed, Dec 14, 2022 at 10:31:32PM -0500, Waiman Long wrote:
> As noted by Michal, the blkg_iostat_set's in the lockless list
> hold reference to blkg's to protect against their removal. Those
> blkg's hold reference to blkcg. When a cgroup is being destroyed,
> cgroup_rstat_flush() is only called at css_release_work_fn() which
> is called when the blkcg reference count reaches 0. This circular
> dependency will prevent blkcg and some blkgs from being freed after
> they are made offline.
>
> It is less a problem if the cgroup to be destroyed also has other
> controllers like memory that will call cgroup_rstat_flush() which will
> clean up the reference count. If block is the only controller that uses
> rstat, these offline blkcg and blkgs may never be freed leaking more
> and more memory over time.
>
> To prevent this potential memory leak, a new cgroup_rstat_css_cpu_flush()
> function is added to flush stats for a given css and cpu. This new
> function will be called at blkgs destruction path, blkcg_destroy_blkgs(),
> whenever there are still pending stats to be flushed. This will release
> the references to blkgs allowing them to be freed and indirectly allow
> the freeing of blkcg.
>
> Fixes: 3b8cc6298724 ("blk-cgroup: Optimize blkcg_rstat_flush()")
> Signed-off-by: Waiman Long
> Acked-by: Tejun Heo
> ---
>  block/blk-cgroup.c     | 16 ++++++++++++++++
>  include/linux/cgroup.h |  1 +
>  kernel/cgroup/rstat.c  | 18 ++++++++++++++++++
>  3 files changed, 35 insertions(+)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index ca28306aa1b1..a2a1081d9d1d 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1084,6 +1084,8 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
>   */
>  static void blkcg_destroy_blkgs(struct blkcg *blkcg)
>  {
> +	int cpu;
> +
>  	/*
>  	 * blkcg_destroy_blkgs() shouldn't be called with all the blkcg
>  	 * references gone.
> @@ -1093,6 +1095,20 @@ static void blkcg_destroy_blkgs(struct blkcg *blkcg)
>
>  	might_sleep();
>
> +	/*
> +	 * Flush all the non-empty percpu lockless lists to release the
> +	 * blkg references held by those lists which, in turn, will
> +	 * allow those blkgs to be freed and release their references to
> +	 * blkcg. Otherwise, they may not be freed at all becase of this
> +	 * circular dependency resulting in memory leak.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		struct llist_head *lhead = per_cpu_ptr(blkcg->lhead, cpu);
> +
> +		if (!llist_empty(lhead))
> +			cgroup_rstat_css_cpu_flush(&blkcg->css, cpu);
> +	}

I guess it is possible for new iostat_cpu to be added just after the
llist_empty() check.

Thanks,
Ming
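
P.S. To make the window concrete, here is a rough userspace sketch using C11 atomics. All names in it (struct lhead, lhead_add(), lhead_flush(), check_then_flush()) are hypothetical stand-ins, not the kernel's llist or rstat API, and the "late" update is injected by hand rather than coming from a real concurrent CPU. It only shows that an entry added after the emptiness check is left behind when the flush is skipped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's llist; not the real API. */
struct node {
	struct node *next;
};

struct lhead {
	_Atomic(struct node *) first;
};

static bool lhead_empty(struct lhead *h)
{
	return atomic_load(&h->first) == NULL;
}

/* Lock-free push, in the spirit of llist_add(). */
static void lhead_add(struct lhead *h, struct node *n)
{
	struct node *old = atomic_load(&h->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Detach the whole list and return how many entries were on it. */
static int lhead_flush(struct lhead *h)
{
	int n = 0;

	for (struct node *p = atomic_exchange(&h->first, NULL); p; p = p->next)
		n++;
	return n;
}

/*
 * One iteration of the check-then-flush pattern. "late" stands for an
 * update that arrives after the emptiness check; pass NULL for none.
 * Returns the number of entries that the pass left pending.
 */
static int check_then_flush(struct lhead *h, struct node *late)
{
	bool empty = lhead_empty(h);	/* the check */

	if (late)
		lhead_add(h, late);	/* update lands inside the window */
	if (!empty)
		lhead_flush(h);		/* skipped when the check saw an empty list */

	return lhead_flush(h);		/* drain and count what was left behind */
}
```

Calling check_then_flush() on an empty list with a late entry reports one entry left pending, which in the blkcg case would be an iostat entry still holding its blkg reference. If the list was already non-empty at the check, the flush also picks up the late entry and nothing is left.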