From mboxrd@z Thu Jan 1 00:00:00 1970
From: Breno Leitao
Date: Fri, 23 May 2025 10:21:06 -0700
Subject: [PATCH] memcg: Always call cond_resched() after fn()
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20250523-memcg_fix-v1-1-ad3eafb60477@debian.org>
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Andrew Morton, Chen Ridong, Greg Kroah-Hartman
Cc: Michal Hocko, cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@meta.com,
 Michael van der Westhuizen, Usama Arif, Pavel Begunkov, Rik van Riel,
 Breno Leitao
X-Mailer: b4 0.15-dev-42535

I am seeing soft lockups on certain machine types when a cgroup OOMs.
This happens because killing a process on certain machines can be very
slow, which causes soft lockups and RCU stalls. It usually happens when
the cgroup has MANY processes and memory.oom.group is set.

An example I am seeing in production:

    [462012.244552] Memory cgroup out of memory: Killed process 3370438 (crosvm) ....
    ....
    [462037.318059] Memory cgroup out of memory: Killed process 4171372 (adb) ....
    [462037.348314] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [stat_manager-ag:1618982]
    ....

A quick look at why this is so slow suggests it is related to the serial
flush on certain machine types. For all the crashes I saw, the target
CPU was in console_flush_all().

In the case above, there are thousands of processes in the cgroup, and
it soft locks up before it reaches the 1024 limit in the code (which
would call cond_resched()). So calling cond_resched() only once every
1024 tasks is not sufficient.

Remove the counter-based conditional rescheduling logic and call
cond_resched() unconditionally after each task iteration, after fn() is
called. This avoids the lockup independently of how slow fn() is.

Cc: Michael van der Westhuizen
Cc: Usama Arif
Cc: Pavel Begunkov
Suggested-by: Rik van Riel
Signed-off-by: Breno Leitao
Fixes: 46576834291869457 ("memcg: fix soft lockup in the OOM process")
---
 mm/memcontrol.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c96c1f2b9cf57..2d4d65f25fecd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1168,7 +1168,6 @@ void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 {
 	struct mem_cgroup *iter;
 	int ret = 0;
-	int i = 0;
 
 	BUG_ON(mem_cgroup_is_root(memcg));
 
@@ -1178,10 +1177,9 @@ void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
 
 		css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
 		while (!ret && (task = css_task_iter_next(&it))) {
-			/* Avoid potential softlockup warning */
-			if ((++i & 1023) == 0)
-				cond_resched();
 			ret = fn(task, arg);
+			/* Avoid potential softlockup warning */
+			cond_resched();
 		}
 		css_task_iter_end(&it);
 		if (ret) {

---
base-commit: ea15e046263b19e91ffd827645ae5dfa44ebd044
change-id: 20250523-memcg_fix-012257f3109e

Best regards,
-- 
Breno Leitao