Date: Tue, 14 Jan 2025 17:54:16 +0100
From: Michal Hocko
To: Johannes Weiner
Cc: Yosry Ahmed, Rik van Riel, Balbir Singh, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Nhat Pham
Subject: Re: [PATCH v2] memcg: allow exiting tasks to write back data to swap
References: <20241212115754.38f798b3@fangorn> <20241212183012.GB1026@cmpxchg.org> <20250114160955.GA1115056@cmpxchg.org>

On Tue 14-01-25 17:46:39, Michal Hocko wrote:
> On Tue 14-01-25 11:09:55, Johannes Weiner wrote:
> > Hi,
> >
> > On Mon, Dec 16, 2024 at 04:39:12PM +0100, Michal Hocko wrote:
> > > On Thu 12-12-24 13:30:12, Johannes Weiner wrote:
> [...]
> > > > If we return -ENOMEM to an OOM victim in a fault, the fault handler
> > > > will re-trigger OOM, which will find the existing OOM victim and do
> > > > nothing, then restart the fault.
> > >
> > > IIRC the task will handle the pending SIGKILL if the #PF fails. If the
> > > charge happens from the exit path then we rely on ENOMEM returned from
> > > gup as a signal to back off. Do we have any caller that keeps retrying
> > > on ENOMEM?
> >
> > We managed to extract a stack trace of the livelocked task:
> >
> > obj_cgroup_may_swap
> > zswap_store
> > swap_writepage
> > shrink_folio_list
> > shrink_lruvec
> > shrink_node
> > do_try_to_free_pages
> > try_to_free_mem_cgroup_pages
>
> OK, so this is the reclaim path and it fails due to reasons you mention
> below. This will retry several times until it hits mem_cgroup_oom which
> will bail in mem_cgroup_out_of_memory because of task_is_dying (returns
> true) and retry the charge + reclaim (as the oom killer hasn't done
> anything) with passed_oom = true this time and eventually got to nomem
> path and returns ENOMEM.

Btw. is there any actual reason why we cannot go nomem without going to
the oom killer (just to bail out) and go through the whole cycle again?
That seems arbitrary, and it simply burns a lot of cycles without much
chance of a better outcome.

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7b3503d12aaf..eb45eaf0acfc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2268,8 +2268,7 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (gfp_mask & __GFP_RETRY_MAYFAIL)
 		goto nomem;
 
-	/* Avoid endless loop for tasks bypassed by the oom killer */
-	if (passed_oom && task_is_dying())
+	if (task_is_dying())
 		goto nomem;
 
 	/*
-- 
Michal Hocko
SUSE Labs
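
For readers following the thread, the retry flow described above can be sketched as a small userspace model. This is not the kernel code: `model_try_charge`, `model_memcg_oom`, `struct charge_stats`, the `require_passed_oom` switch and both `MODEL_*` constants are hypothetical names made up for illustration. The retry budget of 16 is an assumption meant to mirror `MAX_RECLAIM_RETRIES`, and reclaim is assumed to free nothing on every pass, as in the reported stack trace.

```c
#include <stdbool.h>

/* Assumed retry budget, meant to mirror the kernel's MAX_RECLAIM_RETRIES. */
#define MODEL_MAX_RECLAIM_RETRIES 16
/* Stand-in for ENOMEM so the model needs no errno.h. */
#define MODEL_ENOMEM 12

/* Counters used to compare the two variants of the check. */
struct charge_stats {
	int reclaim_passes;   /* reclaim attempts that freed nothing */
	int oom_invocations;  /* times the OOM path was entered */
};

/*
 * Hypothetical model of the OOM step: for a dying task it kills nothing
 * but still reports "progress" (true), which is what makes the charge
 * path go around again. For any other task this model assumes the OOM
 * killer finds nothing to kill and reports false.
 */
static bool model_memcg_oom(bool task_dying, struct charge_stats *st)
{
	st->oom_invocations++;
	return task_dying;
}

/*
 * Model of the charge retry loop when reclaim never makes progress.
 * require_passed_oom = true  models "passed_oom && task_is_dying()".
 * require_passed_oom = false models the proposed "task_is_dying()".
 * Always ends in -MODEL_ENOMEM here because nothing is ever reclaimed.
 */
int model_try_charge(bool task_dying, bool require_passed_oom,
		     struct charge_stats *st)
{
	int nr_retries = MODEL_MAX_RECLAIM_RETRIES;
	bool passed_oom = false;

retry:
	/* One reclaim pass that frees nothing. */
	st->reclaim_passes++;

	if (nr_retries--)
		goto retry;

	/* The check under discussion: bail out for a dying task. */
	if ((passed_oom || !require_passed_oom) && task_dying)
		return -MODEL_ENOMEM;

	/* Enter the OOM path; restart the whole cycle if it reports progress. */
	if (model_memcg_oom(task_dying, st)) {
		passed_oom = true;
		nr_retries = MODEL_MAX_RECLAIM_RETRIES;
		goto retry;
	}

	return -MODEL_ENOMEM;
}
```

Under these assumptions, a dying task goes through 34 reclaim passes and one OOM invocation with the existing check, and 17 reclaim passes with no OOM invocation with the proposed one. Both variants end in the same ENOMEM; only the amount of work done on the way differs.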