Date: Thu, 11 Jan 2024 16:50:36 +0000
From: Shakeel Butt
To: Johannes Weiner
Cc: Andrew Morton, Michal Hocko, Roman Gushchin, Muchun Song, Tejun Heo, Dan Schatzberg, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: memcontrol: don't throttle dying tasks on memory.high
Message-ID: <20240111165036.w2qbetwrxb2mcur4@google.com>
In-Reply-To: <20240111132902.389862-1-hannes@cmpxchg.org>
References: <20240111132902.389862-1-hannes@cmpxchg.org>

On Thu, Jan 11, 2024 at 08:29:02AM -0500, Johannes Weiner wrote:
> While investigating hosts with high cgroup memory pressures, Tejun
> found culprit zombie tasks that were holding on to a lot of
> memory, had SIGKILL pending, but were stuck in memory.high reclaim.
>
> In the past, we used to always force-charge allocations from tasks
> that were exiting in order to accelerate them dying and freeing up
> their rss. This changed for memory.max in a4ebf1b6ca1e ("memcg:
> prohibit unconditional exceeding the limit of dying tasks"); it noted
> that this can cause (userspace inducible) containment failures, so it
> added a mandatory reclaim and OOM kill cycle before forcing charges.
> At the time, memory.high enforcement was handled in the userspace
> return path, which isn't reached by dying tasks, and so memory.high
> was still never enforced by dying tasks.
>
> When c9afe31ec443 ("memcg: synchronously enforce memory.high for large
> overcharges") added synchronous reclaim for memory.high, it added
> unconditional memory.high enforcement for dying tasks as well. The
> callstack shows that this path is where the zombie is stuck.
>
> We need to accelerate dying tasks getting past memory.high, but we
> cannot do it quite the same way as we do for memory.max: memory.max is
> enforced strictly, and tasks aren't allowed to move past it without
> FIRST reclaiming and OOM killing if necessary. This ensures very small
> levels of excess. With memory.high, though, enforcement happens lazily
> after the charge, and OOM killing is never triggered. A lot of
> concurrent threads could have pushed, or could actively be pushing,
> the cgroup into excess. The dying task will enter reclaim on every
> allocation attempt, with little hope of restoring balance.
>
> To fix this, skip synchronous memory.high enforcement on dying tasks
> altogether again. Update memory.high path documentation while at it.
>
> Fixes: c9afe31ec443 ("memcg: synchronously enforce memory.high for large overcharges")
> Reported-by: Tejun Heo
> Signed-off-by: Johannes Weiner

Acked-by: Shakeel Butt
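
For readers following the thread, the behaviour described in the quoted
commit message can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not the patch itself: the names
cgroup_model, task_model, model_reclaim() and model_charge() are made up
for illustration and are not kernel APIs, and reclaim is modelled as
making no progress because concurrent threads keep the cgroup in excess.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct cgroup_model {
	unsigned long usage;	/* pages currently charged */
	unsigned long high;	/* memory.high threshold, in pages */
};

struct task_model {
	bool dying;			/* SIGKILL pending or already exiting */
	unsigned long reclaim_attempts;	/* times the task entered reclaim */
};

/*
 * Assumption: other threads refill whatever is freed, so reclaim
 * never brings usage back under the threshold. Only count the attempt.
 */
static void model_reclaim(struct task_model *t)
{
	t->reclaim_attempts++;
}

/*
 * Charge nr_pages. memory.high is checked lazily, after the charge,
 * so the charge itself always succeeds; only the throttling differs.
 * skip_dying selects the behaviour the commit message proposes.
 */
static void model_charge(struct cgroup_model *cg, struct task_model *t,
			 unsigned long nr_pages, bool skip_dying)
{
	assert(nr_pages > 0);
	cg->usage += nr_pages;

	if (cg->usage <= cg->high)
		return;		/* under memory.high: nothing to enforce */
	if (skip_dying && t->dying)
		return;		/* let the task exit and free its memory */
	model_reclaim(t);	/* synchronous memory.high enforcement */
}
```

With skip_dying set to false, a dying task in an over-limit cgroup
enters reclaim on every charge; with it set to true, the same task is
never throttled and can proceed to exit.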