Date: Mon, 9 Dec 2024 19:08:19 +0100
From: Michal Hocko
To: Rik van Riel
Cc: Johannes Weiner, kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, cgroups@vger.kernel.org
Subject: Re: [PATCH] mm: allow exiting processes to exceed the memory.max limit
In-Reply-To: <20241209124233.3543f237@fangorn>

On Mon 09-12-24 12:42:33, Rik van Riel wrote:
> It is possible for programs to get stuck in exit, when their
> memcg is at or above the memory.max limit, and things like
> the do_futex() call from mm_release() need to page memory in.
>
> This can hang forever, but it really doesn't have to.

Are you sure this is really happening?

>
> The amount of memory that the exit path will page into memory
> should be relatively small, and letting exit proceed faster
> will free up memory faster.
>
> Allow PF_EXITING tasks to bypass the cgroup memory.max limit
> the same way PF_MEMALLOC already does.
>
> Signed-off-by: Rik van Riel
> ---
>  mm/memcontrol.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7b3503d12aaf..d1abef1138ff 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2218,11 +2218,12 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
>
>  	/*
>  	 * Prevent unbounded recursion when reclaim operations need to
> -	 * allocate memory. This might exceed the limits temporarily,
> -	 * but we prefer facilitating memory reclaim and getting back
> -	 * under the limit over triggering OOM kills in these cases.
> +	 * allocate memory, or the process is exiting. This might exceed
> +	 * the limits temporarily, but we prefer facilitating memory reclaim
> +	 * and getting back under the limit over triggering OOM kills in
> +	 * these cases.
>  	 */
> -	if (unlikely(current->flags & PF_MEMALLOC))
> +	if (unlikely(current->flags & (PF_MEMALLOC | PF_EXITING)))
>  		goto force;

We already have the task_is_dying() bail out. Why is that insufficient?
It currently kicks in when the OOM situation is triggered, while your
patch triggers the bypass much earlier. We used to do that in the past,
but this got changed by a4ebf1b6ca1e ("memcg: prohibit unconditional
exceeding the limit of dying tasks"). I believe the situation in vmalloc
has changed since then, but I suspect the fundamental problem, that
dying tasks could allocate a lot of memory, remains.

There is still this
: It has been observed that it is not really hard to trigger these
: bypasses and cause global OOM situation.

that really needs to be re-evaluated.

-- 
Michal Hocko
SUSE Labs