From: Muchun Song
Subject: Re: [PATCH] memcg: protect concurrent access to mem_cgroup_idr
Date: Mon, 5 Aug 2024 10:58:22 +0800
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Linux Memory Management List, Linux Kernel Mailing List, Meta kernel team, cgroups@vger.kernel.org
Message-Id: <39C19964-C74E-4479-AB21-74B7C603CAC8@linux.dev>
In-Reply-To: <20240802235822.1830976-1-shakeel.butt@linux.dev>
References: <20240802235822.1830976-1-shakeel.butt@linux.dev>

> On Aug 3, 2024, at 07:58, Shakeel Butt wrote:
>
> The commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure
> after many small jobs") decoupled the memcg IDs from the CSS ID space to
> fix the cgroup creation failures. It introduced IDR to maintain the
> memcg ID space. The IDR depends on external synchronization mechanisms
> for modifications. For the mem_cgroup_idr, the idr_alloc() and
> idr_replace() happen within css callback and thus are protected through
> cgroup_mutex from concurrent modifications. However idr_remove() for
> mem_cgroup_idr was not protected against concurrency and can be run
> concurrently for different memcgs when their refcnt drops to zero.
> Fix that.
>
> We have been seeing list_lru based kernel crashes at a low frequency in
> our fleet for a long time. These crashes were in different parts of the
> list_lru code including list_lru_add(), list_lru_del() and reparenting
> code. Upon further inspection, it looked like for a given object (dentry
> and inode), the super_block's list_lru didn't have list_lru_one for the
> memcg of that object. The initial suspicions were either the object is
> not allocated through kmem_cache_alloc_lru() or somehow
> memcg_list_lru_alloc() failed to allocate list_lru_one() for a memcg but
> returned success. No evidence was found for these cases.
>
> Looking deeper, we started seeing situations where a valid memcg's ID
> is not present in mem_cgroup_idr and in some cases multiple valid memcgs
> have the same ID and mem_cgroup_idr is pointing to one of them. So, the most
> reasonable explanation is that these situations can happen due to a race
> between multiple idr_remove() calls or a race between
> idr_alloc()/idr_replace() and idr_remove(). These races are causing
> multiple memcgs to acquire the same ID and then offlining of one of them
> would clean up list_lrus on the system for all of them. Later accesses from
> other memcgs to the list_lru cause crashes due to missing list_lru_one.
>
> Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
> Signed-off-by: Shakeel Butt

Acked-by: Muchun Song

Thanks.
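
For readers following the thread, the locking rule in the quoted changelog (every modification of the ID map, including removal, must be serialized by the caller) can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch itself: struct id_table, ID_MAX and the id_table_*() helpers are hypothetical names, a fixed array stands in for the kernel IDR, and a pthread mutex stands in for whatever lock the actual fix uses around idr_alloc(), idr_replace() and idr_remove().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ID_MAX 64 /* hypothetical, small ID space for illustration only */

/* Hypothetical userspace stand-in for an IDR: slot i holds the pointer for ID i + 1. */
struct id_table {
	pthread_mutex_t lock;   /* serializes every modification of slots[] */
	void *slots[ID_MAX];
};

void id_table_init(struct id_table *t)
{
	int rc = pthread_mutex_init(&t->lock, NULL);

	assert(rc == 0);
	(void)rc;
	for (size_t i = 0; i < ID_MAX; i++)
		t->slots[i] = NULL;
}

/* Allocate the lowest free ID for ptr; returns 1..ID_MAX, or -1 on failure. */
int id_table_alloc(struct id_table *t, void *ptr)
{
	int id = -1;

	if (!ptr)
		return -1;
	pthread_mutex_lock(&t->lock);
	for (int i = 0; i < ID_MAX; i++) {
		if (!t->slots[i]) {
			t->slots[i] = ptr;
			id = i + 1;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);
	return id;
}

/*
 * Remove an ID under the same lock as allocation. Leaving this path
 * unlocked is the analogue of the unprotected idr_remove() described
 * in the quoted changelog.
 */
void *id_table_remove(struct id_table *t, int id)
{
	void *old;

	if (id < 1 || id > ID_MAX)
		return NULL;
	pthread_mutex_lock(&t->lock);
	old = t->slots[id - 1];
	t->slots[id - 1] = NULL;
	pthread_mutex_unlock(&t->lock);
	return old;
}

/* Look up the pointer registered for an ID, or NULL if the ID is unused. */
void *id_table_find(struct id_table *t, int id)
{
	void *ptr;

	if (id < 1 || id > ID_MAX)
		return NULL;
	pthread_mutex_lock(&t->lock);
	ptr = t->slots[id - 1];
	pthread_mutex_unlock(&t->lock);
	return ptr;
}
```

The array version cannot reproduce the corruption of the IDR's internal tree that an unlocked removal can cause in the kernel; it only shows the shape of the rule that allocation and removal share one lock, so that two live objects never end up with the same ID.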