Date: Mon, 5 Aug 2024 12:59:37 -0400
From: Johannes Weiner
To: Shakeel Butt
Cc: Andrew Morton, Michal Hocko, Roman Gushchin, Muchun Song, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Meta kernel team, cgroups@vger.kernel.org
Subject: Re: [PATCH] memcg: protect concurrent access to mem_cgroup_idr
Message-ID: <20240805165937.GA322282@cmpxchg.org>
References: <20240802235822.1830976-1-shakeel.butt@linux.dev>
In-Reply-To: <20240802235822.1830976-1-shakeel.butt@linux.dev>

On Fri, Aug 02, 2024 at 04:58:22PM -0700, Shakeel Butt wrote:
> The commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure
> after many small jobs") decoupled the memcg IDs from the CSS ID space to
> fix the cgroup creation failures. It introduced IDR to maintain the
> memcg ID space.
> The IDR depends on external synchronization mechanisms
> for modifications. For the mem_cgroup_idr, the idr_alloc() and
> idr_replace() happen within css callback and thus are protected through
> cgroup_mutex from concurrent modifications. However idr_remove() for
> mem_cgroup_idr was not protected against concurrency and can be run
> concurrently for different memcgs when their refcnt hits zero.
> Fix that.
>
> We have been seeing list_lru based kernel crashes at a low frequency in
> our fleet for a long time. These crashes were in different parts of
> list_lru code including list_lru_add(), list_lru_del() and reparenting
> code. Upon further inspection, it looked like for a given object (dentry
> and inode), the super_block's list_lru didn't have list_lru_one for the
> memcg of that object. The initial suspicions were either the object is
> not allocated through kmem_cache_alloc_lru() or somehow
> memcg_list_lru_alloc() failed to allocate list_lru_one for a memcg but
> returned success. No evidence was found for these cases.
>
> Looking deeper, we started seeing situations where valid memcg's id
> is not present in mem_cgroup_idr and in some cases multiple valid memcgs
> have the same id and mem_cgroup_idr is pointing to one of them. So, the most
> reasonable explanation is that these situations can happen due to race
> between multiple idr_remove() calls or race between
> idr_alloc()/idr_replace() and idr_remove(). These races are causing
> multiple memcgs to acquire the same ID and then offlining of one of them
> would clean up list_lrus on the system for all of them. Later access from
> other memcgs to the list_lru causes crashes due to missing list_lru_one.
>
> Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
> Signed-off-by: Shakeel Butt

Great catch. This has been busted for ages, but the race is so unlikely
that it stayed low profile.

Acked-by: Johannes Weiner

It probably should be Cc: stable as well.
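
For readers following along, the locking rule described in the quoted
changelog can be sketched in userspace C. This is only an illustration under
stated assumptions, not the patch itself: id_table, id_lock, id_alloc(),
id_remove() and id_find() are hypothetical names, a pthread mutex stands in
for whatever lock the kernel change uses, and a flat array stands in for the
IDR. The real IDR is a tree whose internal nodes are modified on removal,
which is why unsynchronized idr_remove() calls can corrupt it. The sketch
shows only the point that every modifier, removal included, takes the same
lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ID_MAX 64 /* hypothetical table size, chosen for the example */

/* Hypothetical stand-in for mem_cgroup_idr: slot i holds the owner of ID i. */
static void *id_table[ID_MAX];

/* One lock shared by every path that modifies the table. */
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hand out the lowest free ID (starting at 1), or -1 if the table is full. */
static int id_alloc(void *owner)
{
    int id = -1;

    pthread_mutex_lock(&id_lock);
    for (int i = 1; i < ID_MAX; i++) {
        if (id_table[i] == NULL) {
            id_table[i] = owner;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&id_lock);
    return id;
}

/*
 * Release an ID. Removal takes the same lock as allocation, so two
 * removals, or a removal and an allocation, never modify the table
 * at the same time.
 */
static void id_remove(int id)
{
    assert(id >= 1 && id < ID_MAX);

    pthread_mutex_lock(&id_lock);
    id_table[id] = NULL;
    pthread_mutex_unlock(&id_lock);
}

/* Look up the owner of an ID; returns NULL if the ID is not in use. */
static void *id_find(int id)
{
    void *owner = NULL;

    if (id >= 1 && id < ID_MAX) {
        pthread_mutex_lock(&id_lock);
        owner = id_table[id];
        pthread_mutex_unlock(&id_lock);
    }
    return owner;
}
```

In this model a freed ID is reused only after id_remove() has finished, so
two live owners can never end up holding the same ID, which is the failure
mode the changelog describes.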